@JasMehta08
Contributor

This Pull request:

Changes or fixes:

This PR implements new I/O performance metrics in TFile to assist users in analyzing data access patterns, particularly for optimizing jobs on high-latency storage systems (e.g., EOS, XRootD).

  • Added transient protected members fSumSkip and fLastReadEnd to TFile (declared in TFile.h) to track seek distances.

  • Instrumented TFile::ReadBuffer in TFile.cxx to calculate the "skip distance" (absolute difference between the current position and the end of the previous read).

  • Implemented public getters: GetSparseness() and GetRandomness().

  • Metrics Definitions:

    • Sparseness: Total Bytes Read / Total File Size. The fraction of the file that was actually read, so a lower value means a sparser read.

    • Randomness: Sum(Skip Distances) / Total Bytes Read. A higher value indicates more seeking per byte read (inefficient access).

    • Transactions: Uses existing GetReadCalls().

  • Updated TFile::Print() to display these three metrics (Transactions, Sparseness, Randomness) when the file is open.
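
The bookkeeping described in the list above can be sketched outside of ROOT as follows. This is a minimal illustration, not the code in this PR: the `IoMetrics` struct, its field names, `RecordRead`, and the two helper functions are hypothetical, and the sketch assumes that the "end of the previous read" starts at offset 0, so a first read at offset 0 contributes no skip distance.

```cpp
#include <cstdint>

// Hypothetical stand-in for the counters described in this PR (not the TFile code).
struct IoMetrics {
   std::int64_t fileSize = 0;    // total file size in bytes
   std::int64_t bytesRead = 0;   // sum of the lengths of all reads
   std::int64_t sumSkip = 0;     // plays the role of fSumSkip
   std::int64_t lastReadEnd = 0; // plays the role of fLastReadEnd (assumed to start at 0)
   int readCalls = 0;            // plays the role of GetReadCalls()

   // Record one read of `length` bytes starting at `offset`.
   void RecordRead(std::int64_t offset, std::int64_t length)
   {
      // Skip distance: absolute gap between this read's start and the previous read's end.
      sumSkip += (offset > lastReadEnd) ? (offset - lastReadEnd) : (lastReadEnd - offset);
      bytesRead += length;
      lastReadEnd = offset + length;
      ++readCalls;
   }

   // Sparseness = Total Bytes Read / Total File Size.
   double Sparseness() const
   {
      return fileSize > 0 ? static_cast<double>(bytesRead) / fileSize : 0.0;
   }

   // Randomness = Sum(Skip Distances) / Total Bytes Read.
   double Randomness() const
   {
      return bytesRead > 0 ? static_cast<double>(sumSkip) / bytesRead : 0.0;
   }
};

// Three back-to-back 100-byte reads in a 1000-byte file: no gaps between reads.
inline IoMetrics SequentialExample()
{
   IoMetrics m;
   m.fileSize = 1000;
   m.RecordRead(0, 100);
   m.RecordRead(100, 100);
   m.RecordRead(200, 100);
   return m;
}

// Same number of bytes, but jumping forward and then backward through the file.
inline IoMetrics ScatteredExample()
{
   IoMetrics m;
   m.fileSize = 1000;
   m.RecordRead(0, 100);   // skip 0
   m.RecordRead(900, 100); // skip |900 - 100| = 800
   m.RecordRead(50, 100);  // skip |50 - 1000| = 950
   return m;
}
```

In this model both access patterns read the same number of bytes and therefore have the same Sparseness, while only the scattered pattern accumulates a non-zero skip sum and hence a non-zero Randomness. Overlapping reads are counted twice in `bytesRead` here; whether the real implementation does the same is not stated in this description.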

Verification:

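#include "TClass.h"
#include "TFile.h"
#include "TTree.h"
#include "TTreeReader.h"
#include "TTreeReaderValue.h"
#include <iostream>

void test_io_metrics() {
   // Write a small tree with a single int branch to use as test input.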
   const char* filename = "io_metrics_test.root";
   auto f = TFile::Open(filename, "RECREATE");
   TTree *t = new TTree("t", "test tree");
   int val;
   t->Branch("val", &val, "val/I");
   for(int i=0; i<1000; i++) { val = i; t->Fill(); }
   t->Write();
   f->Close();
   delete f;

   std::cout << "\n Test 1: Sequential Read" << std::endl;
   auto fSeq = TFile::Open(filename);
   TTree *tSeq = (TTree*)fSeq->Get("t");
   
   tSeq->SetCacheSize(1000000); 
   tSeq->AddBranchToCache("*");
   
   int count = 0;
   TTreeReader rSeq(tSeq);
   TTreeReaderValue<int> vSeq(rSeq, "val");
   while(rSeq.Next()) { count++; }
   
   fSeq->Print();
   
   if (fSeq->IsA()->GetMethodAllAny("GetRandomness")) {
       double randSeq = fSeq->GetRandomness();
       std::cout << "Sequential Randomness: " << randSeq << std::endl;
       
       std::cout << "\n Test 2: Random Jump Read" << std::endl;
       auto fRand = TFile::Open(filename);
       TTree *tRand = (TTree*)fRand->Get("t");
       
       tRand->SetCacheSize(0); 
       
       tRand->GetEntry(0);
       tRand->GetEntry(900);
       tRand->GetEntry(50);
       
       fRand->Print();

       double randRand = fRand->GetRandomness();
       std::cout << "Random Randomness: " << randRand << std::endl;
       
       if (randRand > randSeq) {
          std::cout << "SUCCESS" << std::endl;
       } else {
          std::cout << "FAILURE" << std::endl;
       }
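       // Release the second file handle.
       fRand->Close();
       delete fRand;
   }
   // Release the first file handle.
   fSeq->Close();
   delete fSeq;
}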

Output:

Processing ../test_io_metrics.C...

 Test 1: Sequential Read
TFile: name=io_metrics_test.root, title=, option=READ
  IO Performance : Transactions=4     , Sparseness=0.7817, Randomness=0.0000
******************************************************************************
*Tree    :t         : test tree                                              *
*Entries :     1000 : Total =            4885 bytes  File  Size =       1869 *
*        :          : Tree compression factor =   2.69                       *
******************************************************************************
*Br    0 :val       : val/I                                                  *
*Entries :     1000 : Total  Size=       4544 bytes  File Size  =       1510 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=   2.69     *
*............................................................................*
Sequential Randomness: 0

 Test 2: Random Jump Read
TFile: name=io_metrics_test.root, title=, option=READ
  IO Performance : Transactions=5     , Sparseness=1.0000, Randomness=0.0341
******************************************************************************
*Tree    :t         : test tree                                              *
*Entries :     1000 : Total =            4889 bytes  File  Size =       1869 *
*        :          : Tree compression factor =   2.69                       *
******************************************************************************
*Br    0 :val       : val/I                                                  *
*Entries :     1000 : Total  Size=       4548 bytes  File Size  =       1510 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=   2.69     *
*............................................................................*
Random Randomness: 0.0341188
SUCCESS

Checklist:

  • tested changes locally

This PR fixes #20853

@JasMehta08 JasMehta08 requested a review from pcanal as a code owner January 16, 2026 13:06

Development

Successfully merging this pull request may close these issues.

[ntuple] add several useful performance metrics
