When initialising the client, it is handy to set the log level to INFO so that you can see how long each query takes and how each URL is built.
We will import some system libraries; the Pandas library, which pyark uses and which is explained later; some entities from the Genomics England models in the protocols package; and finally the pyark client.
import getpass
import logging
import os
import sys
import pandas as pd
from collections import defaultdict, OrderedDict
import pyark
from pyark.cva_client import CvaClient
from protocols.protocol_7_2.reports import Program, Tier, Assembly
from protocols.protocol_7_2.cva import ReportEventType
# set the log level to INFO so that the URLs being called are printed
logging.basicConfig(level=logging.INFO)
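If you want to relate the logged URLs to wall-clock time, one option is to add a timestamp to the log format. The following is a minimal sketch using only the standard logging module; the helper name configure_logging and the format string are illustrative choices, not part of pyark, and the force argument assumes Python 3.8 or later.

```python
import logging

# Illustrative format: timestamp, level, logger name, message.
LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"


def configure_logging(level=logging.INFO):
    """Hypothetical helper: (re)configure the root logger with timestamps.

    force=True (Python 3.8+) replaces any handlers that were already
    attached, which matters in notebooks where basicConfig may have been
    called before.
    """
    logging.basicConfig(level=level, format=LOG_FORMAT, force=True)
    return logging.getLogger()


root_logger = configure_logging()
```

Calling configure_logging(logging.WARNING) later would silence the INFO messages again.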
You need three things to initialise pyark: the CVA backend URL, your user name and your password. In this example these are loaded from environment variables.
The client obtains a token that carries your authorisation level and renews it automatically when necessary. The client also retries requests that fail.
# initialise CVA client and subclients
# every subclient provides access to different sets of data exposed in the API
user = os.environ.get("CVA_USER")
password = os.environ.get("CVA_PASSWORD")
url = os.environ.get("CVA_URL_BASE", "http://localhost:8090")
cva = CvaClient(url_base=url, user=user, password=password)
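Because os.environ.get returns None for an unset variable, a missing CVA_USER or CVA_PASSWORD would only surface when the client tries to authenticate. The sketch below shows one way to fail early and to fall back to an interactive prompt for the password. The helper load_cva_settings is hypothetical and not part of pyark; only the environment variable names, the default URL and the CvaClient keyword names are taken from the example above.

```python
import getpass
import os


def load_cva_settings(env=None, prompt=getpass.getpass):
    """Hypothetical helper: collect the three settings CvaClient needs.

    env    -- mapping to read from (defaults to os.environ)
    prompt -- callable used to ask for the password when CVA_PASSWORD
              is unset (defaults to getpass.getpass, which hides input)
    """
    env = os.environ if env is None else env
    url = env.get("CVA_URL_BASE", "http://localhost:8090")
    user = env.get("CVA_USER")
    if not user:
        raise ValueError("CVA_USER is not set")
    password = env.get("CVA_PASSWORD") or prompt(
        "CVA password for {}: ".format(user))
    if not password:
        raise ValueError("no CVA password provided")
    return {"url_base": url, "user": user, "password": password}


# Possible usage with the client from the example above:
# cva = CvaClient(**load_cva_settings())
```

Passing the prompt as a parameter keeps the helper usable in non-interactive scripts, where a different callable can be supplied.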
Once the token is obtained, a number of subclients become available, each providing access to a different CVA entity or functionality.
cases_client = cva.cases()
pedigrees_client = cva.pedigrees()
entities_client = cva.entities()
variants_client = cva.variants()
report_events_client = cva.report_events()
transactions_client = cva.transactions()
Check the version of your client as follows.
print("pyark version {}".format(pyark.VERSION))
As the simplest usage example, we can count the number of entities in CVA.
# we can count the total number of cases
cases_client.count()
# or we can count the number of cases given some criteria
cases_client.count(program=Program.rare_disease, panelNames='intellectual disability')
# count the total number of report events
report_events_client.count()
# count the number of report events given some criteria
report_events_client.count(program=Program.rare_disease, type="questionnaire")
# count the total number of variants
variants_client.count()
# count the number of variants given some criteria
variants_client.count(assembly=Assembly.GRCh38, geneSymbols="BRCA2")
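When several counts are wanted side by side, it can help to gather them under readable labels. The sketch below is one way to do that with the standard library only. The helpers collect_counts and format_counts are hypothetical and not part of pyark, and the usage shown in the comments assumes that each count() call returns a plain number.

```python
from collections import OrderedDict


def collect_counts(queries):
    """Run each zero-argument callable and store its result by label.

    queries -- ordered mapping of label -> callable returning a count
    """
    counts = OrderedDict()
    for label, run_query in queries.items():
        counts[label] = run_query()
    return counts


def format_counts(counts):
    """Render the counts as aligned 'label  value' lines."""
    width = max(len(label) for label in counts) if counts else 0
    return "\n".join(
        "{}  {}".format(label.ljust(width), value)
        for label, value in counts.items())


# Possible usage with the subclients created above:
# counts = collect_counts(OrderedDict([
#     ("cases", cases_client.count),
#     ("report events", report_events_client.count),
#     ("variants", variants_client.count),
# ]))
# print(format_counts(counts))
```

To include filtered counts, wrap the call in a lambda, for example lambda: cases_client.count(program=Program.rare_disease).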