Solution for #71 by CalvinFang-code · Pull Request #72 · BrentLab/tfbpapi

CalvinFang-code · 2026-01-24T02:22:24Z

I've made these improvements, hoping they will be useful:

1. Added the _join_comparative_analyses function to _build_metadata_table to incorporate comparative datasets. _join_comparative_analyses queries the comparative dataset using SQL and prepares the result for matching; _parse_composite_identifier parses the ID from the comparative dataset for matching.
2. It worked with Harbison but failed with Hackett due to a case mismatch (uppercase H). Therefore, I added code to _join_comparative_analyses that tries both an uppercase and a lowercase first letter for the repo ID.
3. Added a query_dto function to specifically handle the intersection of specified binding and perturbation datasets.
4. I also found some inconsistencies between datasets.

   Some use semicolons, others use slashes:
   BrentLab/harbison_2004;harbison_2004;3
   BrentLab/rossi_2021/rossi_2021_af_combined

   Some use uppercase, others use lowercase:
   BrentLab/Hackett_2020;hackett_2020;34
   BrentLab/harbison_2004;harbison_2004;3

   Do we need to unify them? Or should we handle them separately with functions?
5. I failed to read the calling cards data; the program crashed several times, and I haven't found the reason yet, so I haven't continued with the analysis.
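
To make the separator and case inconsistencies concrete, here is a minimal sketch of a parser that accepts both identifier forms and produces a case-insensitive matching key. It is not the code in this PR: the names CompositeId, parse_composite_identifier_sketch and match_key are hypothetical, and the reading of the fields (repo ID, config name, optional sample ID) is an assumption based only on the examples listed above.

```python
from typing import NamedTuple, Optional


class CompositeId(NamedTuple):
    """Hypothetical container; the field meanings are assumed from the examples."""
    repo_id: str                 # e.g. "BrentLab/harbison_2004"
    config_name: str             # e.g. "harbison_2004"
    sample_id: Optional[str]     # e.g. "3"; None when the identifier has no sample part


def parse_composite_identifier_sketch(raw: str) -> CompositeId:
    """Split an identifier written with either ';' or '/' separators."""
    raw = raw.strip()
    if ";" in raw:
        # Assumed layout: "<org>/<repo>;<config>;<sample>"
        parts = raw.split(";")
        if len(parts) != 3:
            raise ValueError(f"expected 3 ';'-separated parts, got {raw!r}")
        repo_id, config_name, sample_id = parts
        return CompositeId(repo_id, config_name, sample_id)
    # Assumed layout: "<org>/<repo>/<config>" with no sample part
    parts = raw.split("/")
    if len(parts) != 3:
        raise ValueError(f"expected 3 '/'-separated parts, got {raw!r}")
    org, repo, config_name = parts
    return CompositeId(f"{org}/{repo}", config_name, None)


def match_key(cid: CompositeId) -> tuple:
    """Build a key that ignores letter case in the repo ID and config name."""
    return (cid.repo_id.casefold(), cid.config_name.casefold(), cid.sample_id)


hackett = parse_composite_identifier_sketch("BrentLab/Hackett_2020;hackett_2020;34")
rossi = parse_composite_identifier_sketch("BrentLab/rossi_2021/rossi_2021_af_combined")
```

Comparing on match_key rather than on the raw strings would avoid having to try the uppercase and lowercase spellings one after the other, at the cost of treating repo IDs that differ only by case as the same repo.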

CalvinFang-code · 2026-01-24T02:27:40Z

I find this strange; this problem didn't occur in my local testing. I'll investigate what's causing this later.

cmatKhan

My inclination is this isn't worth working on further. It is making very large changes and it's hard for me to follow why some of them are being made.

I would suggest that rather than continuing on with this, it would be better to take that issue and make reproducible examples of how the current parsing method fails.

cmatKhan · 2026-02-04T20:47:56Z

tfbpapi/virtual_db.py

        # Concatenate results, filling NaN for missing columns
        return pd.concat(results, ignore_index=True, sort=False)

+    def query_dto(


Functions in VirtualDB shouldn't be this specific. From the point of view of how the data is stored, DTO isn't meaningfully different from Spearman correlation.
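
One way to read this point is that the analysis type could be an argument rather than part of the function name. The sketch below only illustrates that idea and is not the VirtualDB API: query_comparative, the row layout, and the column names analysis, binding_id, perturbation_id and value are all invented for the example.

```python
from typing import Iterable, Optional

# Hypothetical rows; the column names and values are made up for illustration.
ROWS = [
    {"analysis": "dto", "binding_id": "b1", "perturbation_id": "p1", "value": 0.02},
    {"analysis": "dto", "binding_id": "b2", "perturbation_id": "p9", "value": 0.40},
    {"analysis": "spearman", "binding_id": "b1", "perturbation_id": "p1", "value": 0.31},
]


def query_comparative(
    rows: Iterable[dict],
    analysis: str,
    binding_ids: Optional[set] = None,
    perturbation_ids: Optional[set] = None,
) -> list:
    """Filter rows by analysis name and, optionally, by binding/perturbation IDs."""
    selected = []
    for row in rows:
        if row["analysis"] != analysis:
            continue
        if binding_ids is not None and row["binding_id"] not in binding_ids:
            continue
        if perturbation_ids is not None and row["perturbation_id"] not in perturbation_ids:
            continue
        selected.append(row)
    return selected


# The same call shape serves both analysis types.
dto_rows = query_comparative(ROWS, "dto", binding_ids={"b1"})
spearman_rows = query_comparative(ROWS, "spearman")
```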

CalvinFang-code · 2026-02-04T22:48:57Z

Understood, so currently we should focus on identifying the cause and specific examples of the bug, rather than making these changes.

Also, is VirtualDB specifically responsible for basic functionalities? Is it necessary to encapsulate functions like retrieving DTO data?

cmatKhan · 2026-02-04T22:54:15Z

Yes, and I think one of the problems is the way the comparative analysis dataset is configured. I'm playing with moving it out to the same level as the other repos, and adding a "links to" field which lists other configured datasets.
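
A rough sketch of what such a layout might look like, with a check that every link points at a configured dataset. The actual configuration schema is not shown in this thread, so the dictionary structure, the links_to key, the comparative_analysis entry name and its placeholder repo ID are all assumptions made for illustration.

```python
# Hypothetical configuration: the comparative dataset sits at the same level
# as the other datasets and names the datasets it refers to via "links_to".
CONFIG = {
    "harbison_2004": {"repo_id": "BrentLab/harbison_2004"},
    "hackett_2020": {"repo_id": "BrentLab/Hackett_2020"},
    "comparative_analysis": {
        "repo_id": "example/comparative_analysis",  # placeholder, not a real repo
        "links_to": ["harbison_2004", "hackett_2020"],
    },
}


def unresolved_links(config: dict) -> list:
    """Return (dataset, target) pairs whose target is not a configured dataset."""
    missing = []
    for name, entry in config.items():
        for target in entry.get("links_to", []):
            if target not in config:
                missing.append((name, target))
    return missing


problems = unresolved_links(CONFIG)
```

With links declared by configured name, matching would no longer depend on the spelling or letter case of the repo ID embedded in each composite identifier.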

CalvinFang-code added 2 commits January 23, 2026 20:06

change for BrentLab#71

1aece93

3 BrentLab#71

bf67fbb

V2 solution for BrentLab#71

909e0dd

cmatKhan reviewed Feb 4, 2026

CalvinFang-code wants to merge 3 commits into BrentLab:dev from CalvinFang-code:dev
