In [1]:
%run initialise_pyark.py
POST https://bio-test-cva.gel.zone/cva/api/0/authentication?
Response time : 12 ms
pyark version 4.0.4

Select cases based on a given segregation pattern

For each case we store, in the field countTieringSegregationPatterns, the number of report events having each segregation pattern. The filter parameter can then apply arithmetic comparisons to those values. The main difficulty with the filter parameter is that you need to know the name of the underlying database field you want to filter by; the other filters exposed by the REST API, by contrast, are documented in Swagger.

In [2]:
cases_client.count(program="rare_disease", filter="countTieringSegregationPatterns.deNovo gt 0")
GET https://bio-test-cva.gel.zone/cva/api/0/cases?program=rare_disease&filter=countTieringSegregationPatterns.deNovo gt 0&count=True
Response time : 246 ms
Out[2]:
8051
In [3]:
cases_client.count(program="rare_disease", filter="countTieringSegregationPatterns.MitochondrialGenome gt 0")
GET https://bio-test-cva.gel.zone/cva/api/0/cases?program=rare_disease&filter=countTieringSegregationPatterns.MitochondrialGenome gt 0&count=True
Response time : 208 ms
Out[3]:
4312
In [4]:
cases_client.count(program="rare_disease", filter="countTieringSegregationPatterns.UniparentalIsodisomy gt 0")
GET https://bio-test-cva.gel.zone/cva/api/0/cases?program=rare_disease&filter=countTieringSegregationPatterns.UniparentalIsodisomy gt 0&count=True
Response time : 200 ms
Out[4]:
5037
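
The three counts above differ only in the name of the segregation pattern, so the filter string can be generated rather than typed by hand. The helper below is a minimal sketch; segregation_filter is a hypothetical name and is not part of pyark.

```python
def segregation_filter(pattern, threshold=0):
    """Build a filter selecting cases with more than `threshold`
    report events for the given segregation pattern.
    Hypothetical helper, not part of pyark."""
    return "countTieringSegregationPatterns.{} gt {}".format(pattern, threshold)


patterns = ["deNovo", "MitochondrialGenome", "UniparentalIsodisomy"]
filters = {pattern: segregation_filter(pattern) for pattern in patterns}

for pattern, flt in filters.items():
    # each string could be passed as the `filter` argument of cases_client.count
    print(pattern, "->", flt)
```

The pattern names must match the field names stored in the database exactly, including their capitalisation.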

Fetch the cases

Once we have a query that selects our cohort, we can fetch those cases. To avoid fetching everything at once, the limit parameter sets the number of cases fetched in a single call. The maximum limit is 200; any greater value is treated as 200. With as_data_frame=True the cases are returned as a Pandas data frame; otherwise they are returned as a native Python list of dictionaries.

To keep the data small in this example, we add some extra filtering and drill down into a smaller set of cases.

In [5]:
cases_client.count(program="rare_disease", filter="countTieringSegregationPatterns.deNovo gt 150")
GET https://bio-test-cva.gel.zone/cva/api/0/cases?program=rare_disease&filter=countTieringSegregationPatterns.deNovo gt 150&count=True
Response time : 231 ms
Out[5]:
1232
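
As a rough illustration of how the limit parameter translates into requests, the sketch below estimates the number of calls needed to page through a selection. It assumes the server returns full pages and clamps limit to 200 as described in this section; calls_needed and MAX_LIMIT are hypothetical names, not part of pyark.

```python
import math

MAX_LIMIT = 200  # maximum page size described in this section


def calls_needed(total_cases, limit):
    """Estimate how many requests are needed to fetch `total_cases` cases.
    Hypothetical helper; values of `limit` above MAX_LIMIT are clamped."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    effective_limit = min(limit, MAX_LIMIT)
    return math.ceil(total_cases / effective_limit)


print(calls_needed(1232, 100))  # batches of 100
print(calls_needed(1232, 500))  # 500 is clamped to 200
```

For the 1232 cases counted above this gives 13 calls with limit=100 and 7 calls with any limit of 200 or more.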
In [6]:
# fetch cases in batches of 100
denovo_cases_generator = cases_client.get_cases(program="rare_disease", hasPositiveDx=True, filter="countTieringSegregationPatterns.deNovo gt 150", limit=100, as_data_frame=True)
In [7]:
# fetch the first batch
denovo_cases = next(denovo_cases_generator)
denovo_cases[["identifier", "version", "countTiered", "tieringVersion", "hasExitQuestionnaire", "hasPositiveDx"]].head()
GET https://bio-test-cva.gel.zone/cva/api/0/cases?program=rare_disease&hasPositiveDx=True&filter=countTieringSegregationPatterns.deNovo gt 150&limit=100&include=__all
Response time : 435 ms
Out[7]:
identifier version countTiered tieringVersion hasExitQuestionnaire hasPositiveDx
_index
0 26401 1 19 1.0.0 True True
1 22476 1 259 0.5.0 True True
2 33166 1 255 1.0.0 True True
3 11679 1 34 0.5.0 True True
4 39732 1 29 1.0.14 True True
In [8]:
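# get them all (DataFrame.append was removed in pandas 2.0, so concatenate the batches instead)
import pandas as pd
denovo_cases = pd.concat([denovo_cases] + list(denovo_cases_generator))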
In [9]:
denovo_cases.shape
Out[9]:
(29, 1945)

Extract relevant information from cases

Now we can extract relevant information, such as case identifiers or the count of tiered variants.

In [10]:
denovo_cases[["identifier", "version", "countTiered", "tieringVersion", "hasExitQuestionnaire", "hasPositiveDx"]].head()
Out[10]:
identifier version countTiered tieringVersion hasExitQuestionnaire hasPositiveDx
_index
0 26401 1 19 1.0.0 True True
1 22476 1 259 0.5.0 True True
2 33166 1 255 1.0.0 True True
3 11679 1 34 0.5.0 True True
4 39732 1 29 1.0.14 True True

Find cases where a given segregation pattern was tier 1 or 2

This information is not available in the case entity, so we need to query the report events of each case to infer it.

In [11]:
# this makes one query per case
denovo_cases["countDeNovoTier12"] = denovo_cases.apply(lambda c: report_events_client.count(caseId=c.identifier, caseVersion=c.version, segregationPattern="deNovo", tiers=["TIER1", "TIER2"]), axis=1)
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=26401&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 11 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=22476&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 15 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=33166&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 13 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=11679&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 10 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=39732&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 10 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=26726&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 18 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=29750&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 10 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=34553&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 13 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=32149&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 13 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=35170&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 13 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=26413&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 12 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=34913&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 11 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=35822&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 11 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=26964&caseVersion=2&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 10 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=36969&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 10 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=26384&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 11 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=26296&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 12 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=26809&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 9 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=40802&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 9 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=30280&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 14 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=34173&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 11 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=32265&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 10 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=27438&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 11 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=37565&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 9 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=39378&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 11 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=32661&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 16 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=18494&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 9 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=34965&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 6 ms
GET https://bio-test-cva.gel.zone/cva/api/0/report-events?caseId=26257&caseVersion=1&segregationPattern=deNovo&tiers=TIER1&tiers=TIER2&count=True
Response time : 9 ms
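
In the requests logged above, the list passed as tiers appears as a repeated query parameter (tiers=TIER1&tiers=TIER2). The sketch below reproduces that query string with the standard library only; it illustrates the encoding and makes no claim about how pyark builds its requests internally.

```python
from urllib.parse import urlencode

# parameters of the first request logged above
params = {
    "caseId": 26401,
    "caseVersion": 1,
    "segregationPattern": "deNovo",
    "tiers": ["TIER1", "TIER2"],
    "count": True,
}

# doseq=True expands a list value into one key=value pair per element
query = urlencode(params, doseq=True)
print(query)
```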
In [12]:
denovo_cases[denovo_cases["countDeNovoTier12"] > 0].shape
Out[12]:
(19, 1946)
In [13]:
denovo_cases[denovo_cases["countDeNovoTier12"] > 0][["identifier", "version", "countTiered", "countDeNovoTier12", "hasExitQuestionnaire", "hasPositiveDx"]].head()
Out[13]:
identifier version countTiered countDeNovoTier12 hasExitQuestionnaire hasPositiveDx
_index
0 26401 1 19 2 True True
2 33166 1 255 1 True True
4 39732 1 29 2 True True
6 29750 1 43 1 True True
7 34553 1 13 3 True True

Select cases based on the number of variants of a certain type

We store the number of tiered variants by variant type as follows:

"countTieredByVariantType" : {
        "MNV" : 0,
        "DELETION" : 11,
        "INSERTION" : 6,
        "INDEL" : 17,
        "SNV" : 247
    },

By querying this data structure we can fetch cases with an unusual number of variants of a certain type.

NOTE: for cases with multiple panels applied, the same variant may be reported multiple times, but the counts above are unique: the same variant is not counted more than once.
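
The dotted names used in filters, such as countTieredByVariantType.INSERTION, are paths into this nested structure. The sketch below flattens a nested record into dotted keys to make the correspondence explicit; flatten is a hypothetical helper, and whether pyark produces its data frame columns in exactly this way is an assumption.

```python
def flatten(record, prefix=""):
    """Flatten nested dictionaries into a single level with dotted keys.
    Hypothetical helper, not part of pyark."""
    flat = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            # recurse into the nested dictionary, extending the dotted path
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


case = {
    "countTieredByVariantType": {
        "MNV": 0,
        "DELETION": 11,
        "INSERTION": 6,
        "INDEL": 17,
        "SNV": 247,
    }
}
flat = flatten(case)
for name in sorted(flat):
    print(name, flat[name])
```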

In [14]:
cases_client.count(program="rare_disease", filter="countTieredByVariantType.INSERTION gt 50")
GET https://bio-test-cva.gel.zone/cva/api/0/cases?program=rare_disease&filter=countTieredByVariantType.INSERTION gt 50&count=True
Response time : 173 ms
Out[14]:
16
In [15]:
cases_client.count(program="rare_disease", filter="countTieredByVariantType.DELETION gt 50")
GET https://bio-test-cva.gel.zone/cva/api/0/cases?program=rare_disease&filter=countTieredByVariantType.DELETION gt 50&count=True
Response time : 188 ms
Out[15]:
0

We can also look at the distribution of these values in a previously downloaded cohort.

In [16]:
# NOTE: the count of panels is not yet a precomputed value in cases (we will add it soon), so we compute it here
denovo_cases["countPanels"] = denovo_cases["reportEventsAnalysisPanels"].apply(len)
In [17]:
denovo_cases[["countPanels", "countTieredByVariantType.DELETION", "countTieredByVariantType.INSERTION", "countTieredByVariantType.SNV"]]
Out[17]:
countPanels countTieredByVariantType.DELETION countTieredByVariantType.INSERTION countTieredByVariantType.SNV
_index
0 6 0 1 17
1 7 3 15 241
2 4 10 7 235
3 8 2 2 30
4 3 1 1 25
5 6 1 1 33
6 10 5 1 35
7 5 2 1 8
8 4 9 18 234
9 5 2 2 20
10 8 0 0 31
11 10 4 2 15
12 5 2 0 27
13 5 1 4 20
14 5 1 1 14
15 6 1 4 32
16 4 7 14 194
17 6 2 2 57
18 3 3 0 23
19 13 0 1 20
20 7 1 3 12
21 3 0 0 18
22 7 1 0 19
23 6 1 0 41
24 7 1 0 27
25 4 11 13 248
26 5 1 1 36
27 2 1 0 38
28 3 0 1 14
In [18]:
denovo_cases["countTieredByVariantType.DELETION"].plot.kde()
Out[18]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fa0197de438>
In [19]:
denovo_cases["countTieredByVariantType.INSERTION"].plot.kde()
Out[19]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fa0198349e8>

Select cases based on the clinical data and case status

Several fields relate to the clinical data and the case status: hasCanonicalTrio, hasFatherSequenced, hasMotherSequenced, probandYearOfBirth, probandSex, probandKaryotipicSex, probandEstimatedAgeAtAnalysis, probandAgeOfOnset, probandNumberDisorders, hasPositiveDx, hasNegativeDx, caseSolvedFamily, segregationQuestion, hasActionable, hasConfirmationDecision, hasPositiveConfirmationOutcome, countProbandPresentPhenotypes, hasClinicalReport, hasExitQuestionnaire, countParticipants, countSamples.

In [20]:
cases_client.count(program="rare_disease", hasCanonicalTrio=True, hasPositiveDx=True)
GET https://bio-test-cva.gel.zone/cva/api/0/cases?program=rare_disease&hasCanonicalTrio=True&hasPositiveDx=True&count=True
Response time : 162 ms
Out[20]:
711
In [21]:
cases_client.count(program="rare_disease", hasCanonicalTrio=True, hasPositiveDx=False)
GET https://bio-test-cva.gel.zone/cva/api/0/cases?program=rare_disease&hasCanonicalTrio=True&hasPositiveDx=False&count=True
Response time : 178 ms
Out[21]:
11343
In [22]:
# get summary of cases to compare solved versus unsolved cases with a clinical trio
cases_client.get_summary(params_list=[
    {'program':'rare_disease', 'hasCanonicalTrio':True, 'hasPositiveDx':True},
    {'program':'rare_disease', 'hasCanonicalTrio':True, 'hasPositiveDx':False}], 
                         as_data_frame=True)[['diagnosticRate', 'countCases']]
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&hasCanonicalTrio=True&hasPositiveDx=True
Response time : 447 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&hasCanonicalTrio=True&hasPositiveDx=False
Response time : 898 ms
Out[22]:
program      hasCanonicalTrio hasPositiveDx  diagnosticRate  countCases
rare_disease True             True           1.000           711
                              False          0.000           11343
In [23]:
# get summary of cases with different number of participants in the family
cases_client.get_summary(params_list=[
    {'program':'rare_disease', 'filter':'countParticipants eq 1'},
    {'program':'rare_disease', 'filter':'countParticipants eq 2'},
    {'program':'rare_disease', 'filter':'countParticipants eq 3'},
    {'program':'rare_disease', 'filter':'countParticipants eq 4'},
    {'program':'rare_disease', 'filter':'countParticipants eq 5'},
    {'program':'rare_disease', 'filter':'countParticipants eq 6'}
], as_data_frame=True)[['diagnosticRate', 'countCases']]
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countParticipants eq 1
Response time : 838 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countParticipants eq 2
Response time : 590 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countParticipants eq 3
Response time : 892 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countParticipants eq 4
Response time : 402 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countParticipants eq 5
Response time : 348 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countParticipants eq 6
Response time : 353 ms
Out[23]:
program      filter                  diagnosticRate  countCases
rare_disease countParticipants eq 1  0.103           11898
             countParticipants eq 2  0.125           5425
             countParticipants eq 3  0.185           11708
             countParticipants eq 4  0.231           1195
             countParticipants eq 5  0.431           249
             countParticipants eq 6  0.500           47
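
The list of parameter sets passed to get_summary above follows a regular pattern, so it can be generated instead of written out. This is a minimal sketch; eq_params is a hypothetical helper and is not part of pyark.

```python
def eq_params(field, values, program="rare_disease"):
    """Build one parameter set per value, each filtering `field eq value`.
    Hypothetical helper, not part of pyark."""
    return [
        {"program": program, "filter": "{} eq {}".format(field, value)}
        for value in values
    ]


params_list = eq_params("countParticipants", range(1, 7))
for params in params_list:
    print(params)
```

The resulting list could then be passed as the params_list argument of cases_client.get_summary, and the same helper would work for other integer fields such as countProbandPresentPhenotypes.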
In [24]:
# get summary of cases with different proband age ranges
cases_client.get_summary(params_list=[
    {'program':'rare_disease', 'filter':'probandEstimatedAgeAtAnalysis lt 2'},
    {'program':'rare_disease', 
     'filter':'probandEstimatedAgeAtAnalysis lt 10 and probandEstimatedAgeAtAnalysis ge 2'},
    {'program':'rare_disease', 
     'filter':'probandEstimatedAgeAtAnalysis lt 18 and probandEstimatedAgeAtAnalysis ge 10'},
    {'program':'rare_disease', 'filter':'probandEstimatedAgeAtAnalysis ge 18'}
], as_data_frame=True)[['diagnosticRate', 'countCases']]
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandEstimatedAgeAtAnalysis lt 2
Response time : 441 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandEstimatedAgeAtAnalysis lt 10 and probandEstimatedAgeAtAnalysis ge 2
Response time : 797 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandEstimatedAgeAtAnalysis lt 18 and probandEstimatedAgeAtAnalysis ge 10
Response time : 681 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandEstimatedAgeAtAnalysis ge 18
Response time : 1226 ms
Out[24]:
program      filter                                                                       diagnosticRate  countCases
rare_disease probandEstimatedAgeAtAnalysis lt 2                                           0.302           176
             probandEstimatedAgeAtAnalysis lt 10 and probandEstimatedAgeAtAnalysis ge 2   0.198           7350
             probandEstimatedAgeAtAnalysis lt 18 and probandEstimatedAgeAtAnalysis ge 10  0.168           4742
             probandEstimatedAgeAtAnalysis ge 18                                          0.127           18000
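
The age ranges above are consecutive bins defined by the cut points 2, 10 and 18. The sketch below derives the same filter strings from a list of cut points, which helps avoid gaps or overlaps between bins; range_filters is a hypothetical helper and is not part of pyark.

```python
def range_filters(field, edges):
    """Build filters for consecutive bins: below the first edge,
    [low, high) between adjacent edges, and at or above the last edge.
    `edges` must be sorted in ascending order. Hypothetical helper."""
    filters = ["{} lt {}".format(field, edges[0])]
    for low, high in zip(edges, edges[1:]):
        filters.append(
            "{f} lt {high} and {f} ge {low}".format(f=field, high=high, low=low)
        )
    filters.append("{} ge {}".format(field, edges[-1]))
    return filters


age_filters = range_filters("probandEstimatedAgeAtAnalysis", [2, 10, 18])
for flt in age_filters:
    print(flt)
```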
In [25]:
# get summary of cases with different count of phenotypes
cases_client.get_summary(params_list=[
    {'program':'rare_disease', 'filter':'countProbandPresentPhenotypes eq 1'},
    {'program':'rare_disease', 'filter':'countProbandPresentPhenotypes eq 2'},
    {'program':'rare_disease', 'filter':'countProbandPresentPhenotypes eq 3'},
    {'program':'rare_disease', 'filter':'countProbandPresentPhenotypes eq 4'},
    {'program':'rare_disease', 'filter':'countProbandPresentPhenotypes eq 5'},
    {'program':'rare_disease', 'filter':'countProbandPresentPhenotypes eq 6'}
], as_data_frame=True)[['diagnosticRate', 'countCases']]
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countProbandPresentPhenotypes eq 1
Response time : 466 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countProbandPresentPhenotypes eq 2
Response time : 494 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countProbandPresentPhenotypes eq 3
Response time : 497 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countProbandPresentPhenotypes eq 4
Response time : 478 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countProbandPresentPhenotypes eq 5
Response time : 461 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=countProbandPresentPhenotypes eq 6
Response time : 469 ms
Out[25]:
program      filter                              diagnosticRate  countCases
rare_disease countProbandPresentPhenotypes eq 1  0.103           2901
             countProbandPresentPhenotypes eq 2  0.113           3567
             countProbandPresentPhenotypes eq 3  0.117           3337
             countProbandPresentPhenotypes eq 4  0.136           2987
             countProbandPresentPhenotypes eq 5  0.134           2802
             countProbandPresentPhenotypes eq 6  0.141           2458
In [26]:
# get summary of cases with different values of information content
cases_client.get_summary(params_list=[
    {'program':'rare_disease', 'filter':'probandPresentPhenotypesInformationContent le 1'},
    {'program':'rare_disease', 
     'filter':'probandPresentPhenotypesInformationContent le 2 and probandPresentPhenotypesInformationContent gt 1'},
    {'program':'rare_disease', 
     'filter':'probandPresentPhenotypesInformationContent le 3 and probandPresentPhenotypesInformationContent gt 2'},
    {'program':'rare_disease', 
     'filter':'probandPresentPhenotypesInformationContent le 4 and probandPresentPhenotypesInformationContent gt 3'},
    {'program':'rare_disease', 
     'filter':'probandPresentPhenotypesInformationContent le 5 and probandPresentPhenotypesInformationContent gt 4'},
    {'program':'rare_disease', 
     'filter':'probandPresentPhenotypesInformationContent le 6 and probandPresentPhenotypesInformationContent gt 5'},
    {'program':'rare_disease', 
     'filter':'probandPresentPhenotypesInformationContent le 7 and probandPresentPhenotypesInformationContent gt 6'},
    {'program':'rare_disease', 
     'filter':'probandPresentPhenotypesInformationContent gt 7'}
], as_data_frame=True)[['diagnosticRate', 'countCases']]
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandPresentPhenotypesInformationContent le 1
Response time : 405 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandPresentPhenotypesInformationContent le 2 and probandPresentPhenotypesInformationContent gt 1
Response time : 402 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandPresentPhenotypesInformationContent le 3 and probandPresentPhenotypesInformationContent gt 2
Response time : 395 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandPresentPhenotypesInformationContent le 4 and probandPresentPhenotypesInformationContent gt 3
Response time : 611 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandPresentPhenotypesInformationContent le 5 and probandPresentPhenotypesInformationContent gt 4
Response time : 839 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandPresentPhenotypesInformationContent le 6 and probandPresentPhenotypesInformationContent gt 5
Response time : 787 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandPresentPhenotypesInformationContent le 7 and probandPresentPhenotypesInformationContent gt 6
Response time : 680 ms
GET https://bio-test-cva.gel.zone/cva/api/0/cases/summary?program=rare_disease&filter=probandPresentPhenotypesInformationContent gt 7
Response time : 667 ms
Out[26]:
program      filter                                                                                               diagnosticRate  countCases
rare_disease probandPresentPhenotypesInformationContent le 1                                                      0.500           4
             probandPresentPhenotypesInformationContent le 4 and probandPresentPhenotypesInformationContent gt 3  0.281           3622
             probandPresentPhenotypesInformationContent le 5 and probandPresentPhenotypesInformationContent gt 4  0.201           8017
             probandPresentPhenotypesInformationContent le 6 and probandPresentPhenotypesInformationContent gt 5  0.152           7594
             probandPresentPhenotypesInformationContent le 7 and probandPresentPhenotypesInformationContent gt 6  0.116           5344
             probandPresentPhenotypesInformationContent gt 7                                                      0.095           5936

Select cases by their genomic data

In [27]:
## Select cases with a positive Dx by diagnostic gene
cases_client.count(
    hasPositiveDx=True,
    filter="classifiedGenes.pathogenic_variant eq 'ENSG00000008710' or classifiedGenes.likely_pathogenic_variant eq 'ENSG00000008710'")
GET https://bio-test-cva.gel.zone/cva/api/0/cases?hasPositiveDx=True&filter=classifiedGenes.pathogenic_variant eq 'ENSG00000008710' or classifiedGenes.likely_pathogenic_variant eq 'ENSG00000008710'&count=True
Response time : 266 ms
Out[27]:
91
In [28]:
## Select cases with a positive Dx that have a pathogenic tier 3 variant and no pathogenic or likely pathogenic tier 1 or tier 2 variant
cases_client.count(
    hasPositiveDx=True,
    filter="countTieredAndClassified.TIER3.pathogenic_variant gt 0 and " +
        "countTieredAndClassified.TIER1.pathogenic_variant eq 0 and " +
        "countTieredAndClassified.TIER1.likely_pathogenic_variant eq 0 and " +
        "countTieredAndClassified.TIER2.pathogenic_variant eq 0 and " +
        "countTieredAndClassified.TIER2.likely_pathogenic_variant eq 0")
GET https://bio-test-cva.gel.zone/cva/api/0/cases?hasPositiveDx=True&filter=countTieredAndClassified.TIER3.pathogenic_variant gt 0 and countTieredAndClassified.TIER1.pathogenic_variant eq 0 and countTieredAndClassified.TIER1.likely_pathogenic_variant eq 0 and countTieredAndClassified.TIER2.pathogenic_variant eq 0 and countTieredAndClassified.TIER2.likely_pathogenic_variant eq 0&count=True
Response time : 249 ms
Out[28]:
57
In [29]:
## Same selection, for likely pathogenic tier 3 variants
cases_client.count(
    hasPositiveDx=True,
    filter="countTieredAndClassified.TIER3.likely_pathogenic_variant gt 0 and " +
        "countTieredAndClassified.TIER1.pathogenic_variant eq 0 and " +
        "countTieredAndClassified.TIER1.likely_pathogenic_variant eq 0 and " +
        "countTieredAndClassified.TIER2.pathogenic_variant eq 0 and " +
        "countTieredAndClassified.TIER2.likely_pathogenic_variant eq 0")
GET https://bio-test-cva.gel.zone/cva/api/0/cases?hasPositiveDx=True&filter=countTieredAndClassified.TIER3.likely_pathogenic_variant gt 0 and countTieredAndClassified.TIER1.pathogenic_variant eq 0 and countTieredAndClassified.TIER1.likely_pathogenic_variant eq 0 and countTieredAndClassified.TIER2.pathogenic_variant eq 0 and countTieredAndClassified.TIER2.likely_pathogenic_variant eq 0&count=True
Response time : 244 ms
Out[29]:
69
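
The two tier 3 filters above share the same four exclusion clauses and differ only in the tier 3 classification. The sketch below assembles the filter string from those parts; only_tier3_filter is a hypothetical helper and is not part of pyark.

```python
CLASSIFICATIONS = ("pathogenic_variant", "likely_pathogenic_variant")


def only_tier3_filter(classification):
    """Build a filter for cases with a tier 3 variant of the given
    classification and no pathogenic or likely pathogenic variant
    in tier 1 or tier 2. Hypothetical helper, not part of pyark."""
    clauses = ["countTieredAndClassified.TIER3.{} gt 0".format(classification)]
    for tier in ("TIER1", "TIER2"):
        for excluded in CLASSIFICATIONS:
            clauses.append(
                "countTieredAndClassified.{}.{} eq 0".format(tier, excluded)
            )
    return " and ".join(clauses)


tier3_filters = {c: only_tier3_filter(c) for c in CLASSIFICATIONS}
for flt in tier3_filters.values():
    print(flt)
```

Each string could be passed as the filter argument of cases_client.count together with hasPositiveDx=True, as in the cells above.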