[io] Implement TFile performance metrics (Sparseness, Randomness) #20916
This Pull request:
Changes or fixes:
This PR implements new I/O performance metrics in TFile to assist users in analyzing data access patterns, particularly for optimizing jobs on high-latency storage systems (e.g., EOS, XRootD).
Added transient protected members fSumSkip and fLastReadEnd to TFile (declared in TFile.h) to track seek distances.
Instrumented TFile::ReadBuffer in TFile.cxx to calculate the "skip distance" (absolute difference between the current position and the end of the previous read).
Implemented public getters: GetSparseness() and GetRandomness().
Metrics Definitions:
Sparseness: Total Bytes Read / Total File Size. Indicates how much of the file was actually consumed.
Randomness: Sum(Skip Distances) / Total Bytes Read. A higher value indicates more seeking per byte read (inefficient access).
Transactions: Number of read calls, as reported by the existing GetReadCalls().
Updated TFile::Print() to display these three metrics (Transactions, Sparseness, Randomness) when the file is open.
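To make the metric definitions above concrete, here is a minimal standalone sketch of the bookkeeping. It is not the code from this PR and does not use ROOT: the `ReadStats` struct and its member names are hypothetical. Starting the "previous read end" at offset 0 and returning 0 when the denominator is zero are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical standalone tracker illustrating the metric definitions.
// It mirrors the idea of fSumSkip / fLastReadEnd but is NOT the TFile code.
struct ReadStats {
    std::int64_t fileSize;         // total file size in bytes
    std::int64_t bytesRead = 0;    // sum of the lengths of all reads
    std::int64_t sumSkip = 0;      // sum of the skip distances
    std::int64_t lastReadEnd = 0;  // end offset of the previous read (assumed to start at 0)
    std::int64_t readCalls = 0;    // number of reads ("transactions")

    explicit ReadStats(std::int64_t size) : fileSize(size) { assert(size >= 0); }

    // Record one read of `length` bytes starting at `offset`.
    void Record(std::int64_t offset, std::int64_t length) {
        assert(offset >= 0 && length >= 0);
        // Skip distance: absolute gap between this read's start and the
        // end of the previous read.
        sumSkip += std::llabs(offset - lastReadEnd);
        lastReadEnd = offset + length;
        bytesRead += length;
        ++readCalls;
    }

    // Sparseness = total bytes read / total file size.
    // Returning 0 for an empty file is an assumption of this sketch.
    double Sparseness() const {
        return fileSize > 0 ? static_cast<double>(bytesRead) / fileSize : 0.0;
    }

    // Randomness = sum of skip distances / total bytes read.
    // Returning 0 before any bytes are read is an assumption of this sketch.
    double Randomness() const {
        return bytesRead > 0 ? static_cast<double>(sumSkip) / bytesRead : 0.0;
    }
};
```

With this sketch, strictly sequential reads starting at offset 0 keep the randomness at 0, since each read begins where the previous one ended. A read that jumps forward or backward adds the size of the jump to the numerator.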
Verification:
Output:
Checklist:
This PR fixes #20853