Sysrev users assign millions of labels to documents. The rsr package lets you access and analyze that data. This demo shows how to access data from a review of prostate cancer biomarkers (sysrev.com/p/81395).

get_answers(81395) |> list_answers() organizes data from the review (p/81395) into a list of tables.

library(rsr)
library(dplyr)  # provides mutate(), select() and inner_join() used below
tbls <- get_answers(81395) |> list_answers()
#> list(basic=<tbl>,biomarker=<tbl>,...)

The basic table links articles (e.g. aid 11781738) to basic label data (logical, string, category). In this project, a basic label records each article's trial identifier (e.g. NCT01682772):

tbls$basic |> 
  mutate(nct=unlist(`NCT Trial ID`)) |> 
  select(aid,nct)
#>        aid nct        
#> 1 11781750 NCT01682772
#> 2 11781738 NCT02854436
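
The unlist() call above suggests that label answers arrive as list-columns, with one list element per article. The sketch below reproduces that step on a toy table so it can be run without contacting Sysrev. The name basic_demo and the list-column layout are assumptions made for illustration; the values are copied from the output above.

```r
library(dplyr)

# Toy stand-in for tbls$basic (hypothetical; the real table comes from rsr).
# The label column is assumed to be a list-column with one value per article.
basic_demo <- tibble(
  aid = c(11781750, 11781738),
  `NCT Trial ID` = list("NCT01682772", "NCT02854436")
)

# unlist() flattens the list-column into a plain character vector.
# This only works when every article has exactly one answer; otherwise
# the flattened vector would not match the number of rows.
basic_flat <- basic_demo |>
  mutate(nct = unlist(`NCT Trial ID`)) |>
  select(aid, nct)

print(basic_flat)
```

If an article can carry several answers for one label, tidyr::unnest() is a common alternative to unlist(), since it expands each answer into its own row.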

The other named elements of the list are group labels, which are themselves tables. Here, the biomarker table describes which trials evaluated which genes.

tbls$biomarker |> 
  mutate(biomarker.name=unlist(biomarker.name)) |> 
  select(aid, biomarker.name)
#>        aid biomarker.name
#> 1 11781738 BRCA1 
#> 2 11781738 BRCA2 

Extracted tables can be joined on the article ID (aid). Here, joining the basic and biomarker tables links trials to biomarkers. ggplot2::geom_tile can then quickly visualize which trials use which biomarkers.

join.tb <- tbls$basic |>
  inner_join(tbls$biomarker,by="aid")

# ggplot(join.tb,
#   aes(x=bmkr, y=study, fill=elig)) + 
#   geom_tile() + …
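
As a self-contained illustration of the join-then-tile idea, here is a minimal sketch on toy data. The tables basic_demo and biomarker_demo are hypothetical stand-ins built from the values shown earlier, and the plot uses a constant fill instead of the fill variable in the commented code above, whose columns are not shown in this demo.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-ins for the flattened basic and biomarker tables.
basic_demo <- tibble(
  aid = c(11781750, 11781738),
  nct = c("NCT01682772", "NCT02854436")
)
biomarker_demo <- tibble(
  aid = c(11781738, 11781738),
  biomarker.name = c("BRCA1", "BRCA2")
)

# inner_join() keeps only articles that appear in both tables, so the
# article without biomarker rows (aid 11781750) drops out of the result.
joined_demo <- inner_join(basic_demo, biomarker_demo, by = "aid")

# One tile per (trial, biomarker) pair present in the joined table.
p <- ggplot(joined_demo, aes(x = biomarker.name, y = nct)) +
  geom_tile(fill = "steelblue", colour = "white") +
  labs(x = "Biomarker", y = "Trial")

print(p)
```

Using left_join() instead would keep trials with no recorded biomarker, which then show up as NA in biomarker.name.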

Now you know how to get open-access review data from Sysrev. Check out the rsr reference for more documentation.