Clin Orthop Relat Res. 2023 Mar 1;481(3):491-508.
doi: 10.1097/CORR.0000000000002282. Epub 2022 Jun 21.

Is the Number of National Database Research Studies in Musculoskeletal Sarcoma Increasing, and Are These Studies Reliable?

Joshua M Lawrenz et al. Clin Orthop Relat Res.

Abstract

Background: Large national databases have become a common source of information on patterns of cancer care in the United States, particularly for low-incidence diseases such as sarcoma. Although aggregating information from many hospitals can achieve statistical power, this may come at a cost when complex variables must be abstracted from the medical record. It is currently unclear how often the Surveillance, Epidemiology, and End Results (SEER) database and the National Cancer Database (NCDB) have been used in musculoskeletal sarcoma research over the last two decades and whether their use tends to produce papers with conflicting findings.

Questions/purposes: (1) Is the number of published studies using the SEER and NCDB databases in musculoskeletal sarcoma research increasing over time? (2) What are the author, journal, and content characteristics of these studies? (3) Do studies using the SEER and the NCDB databases for similar diagnoses and study questions report concordant or discordant key findings? (4) Are the administrative data reported by our institution to the SEER and the NCDB databases concordant with the data in our longitudinally maintained, physician-run orthopaedic oncology dataset?

Methods: To answer our first three questions, PubMed was searched from 2001 through 2020 for all studies using the SEER or the NCDB databases to evaluate sarcoma. Studies were excluded from the review if they did not use these databases or studied anatomic locations other than the extremities, nonretroperitoneal pelvis, trunk, chest wall, or spine. To answer our first question, the number of SEER and NCDB studies was counted by year. The publication rate over the 20-year span was assessed with simple linear regression modeling. The difference in the mean number of studies between 5-year intervals (2001-2005, 2006-2010, 2011-2015, 2016-2020) was also assessed with Student t-tests. To answer our second question, we recorded and summarized descriptive data regarding author, journal, and content for these studies. To answer our third question, we grouped all studies by diagnosis and then identified studies that shared the same diagnosis and a similar major study question with at least one other study. We then categorized study questions (and their associated studies) as having concordant findings, discordant findings, or mixed findings. Proportions of studies with concordant, discordant, or mixed findings were compared. To answer our fourth question, a coding audit was performed to assess the concordance of nationally reported administrative data from our institution with data from our longitudinally maintained, physician-run orthopaedic oncology dataset in a series of patients during the past 3 years. Our orthopaedic oncology dataset is maintained on a weekly basis by the senior author, who manually records data directly from the medical record and sarcoma tumor board consensus notes; this dataset served as the gold standard for data comparison. We compared date of birth, surgery date, margin status, tumor size, clinical stage, and adjuvant treatment.
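As an illustration of the simple linear regression step described in the Methods, the following is a minimal sketch of an ordinary least-squares fit of yearly study counts against publication year. The function name `ols_slope_intercept` and the `counts` series are assumptions made for this example: the counts are a synthetic straight line, not the study's data, and the abstract does not state which software the authors used.

```python
from statistics import mean


def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Publication years covered by the search (2001 through 2020).
years = list(range(2001, 2021))

# Synthetic placeholder counts, NOT the study's data:
# an exact line that gains 2 studies per year.
counts = [2 * (year - 2001) + 1 for year in years]

slope, intercept = ols_slope_intercept(years, counts)
print(f"estimated change in studies per year: {slope:.2f}")
```

In this framing, the slope plays the role of the regression coefficient β reported in the Results, that is, the estimated change in the number of studies per additional calendar year.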

Results: The number of musculoskeletal sarcoma studies using the SEER and the NCDB databases has steadily increased over time in a linear regression model (β = 2.51; p < 0.001). The mean number of studies per year more than tripled during 2016-2020 compared with 2011-2015 (39 versus 13 studies; mean difference 26 ± 11; p = 0.03). Of the 299 studies in total, 56% (168 of 299) have been published since 2018. Nineteen institutions published more than five studies, and the largest number of studies from any one institution was 13. Orthopaedic surgeons authored 35% (104 of 299) of studies, and medical oncology journals published 44% (130 of 299). Of the 94 studies (31% of total [94 of 299]) that shared a major study question with at least one other study, 35% (33 of 94) reported discordant key findings, 29% (27 of 94) reported mixed key findings, and 44% (41 of 94) reported concordant key findings; these categories sum to more than 94 because eight studies were included in more than one study question. Both concordant and discordant groups included papers on prognostic factors, demographic factors, and treatment strategies. When we compared nationally reported administrative data from our institution with our orthopaedic oncology dataset, we found clinically important discrepancies in adjuvant treatment (19% [15 of 77]), tumor size (21% [16 of 77]), surgery date (23% [18 of 77]), surgical margins (38% [29 of 77]), and clinical stage (77% [59 of 77]).
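The proportions reported above can be recomputed from the counts given in the abstract. The sketch below does that and also reconciles the three key-finding categories, which sum to 101 rather than 94. The reconciliation rests on an assumption that the abstract does not state explicitly: a study classified differently for two study questions is counted once in each category, whereas the one study with mixed findings for both of its questions is counted once. The variable names are illustrative only.

```python
# Counts copied from the abstract: (numerator, denominator, stated percent).
audit = {
    "adjuvant treatment": (15, 77, 19),
    "tumor size": (16, 77, 21),
    "surgery date": (18, 77, 23),
    "surgical margins": (29, 77, 38),
    "clinical stage": (59, 77, 77),
}
findings = {
    "discordant": (33, 94, 35),
    "mixed": (27, 94, 29),
    "concordant": (41, 94, 44),
}


def percent(numerator, denominator):
    """Proportion as a whole-number percentage."""
    return round(100 * numerator / denominator)


# Labels whose recomputed percentage differs from the stated one.
mismatches = [
    label
    for label, (n, d, stated) in {**audit, **findings}.items()
    if percent(n, d) != stated
]

category_total = sum(n for n, _, _ in findings.values())

# Fig. 2 footnote b: five studies were concordant for one question and
# discordant for another, one was mixed and discordant, and one was mixed
# and concordant. ASSUMPTION: each of these seven is counted in two categories.
extra_counts = 5 + 1 + 1
unique_studies = category_total - extra_counts

print("mismatched percentages:", mismatches)
print("category total:", category_total, "unique studies:", unique_studies)
```

Under that assumption, the recomputed percentages all match the stated values, and the category total of 101 reduces to the 94 unique studies reported.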

Conclusion: Appropriate use of databases in musculoskeletal cancer research is essential to promote clear interpretation of findings, as almost two-thirds of studies we evaluated that asked similar study questions produced discordant or mixed key findings. Readers should be mindful of the differences in what each database seeks to convey because asking the same questions of different databases may result in different answers depending on what information each database captures. Likewise, differences in how studies determine which patients to include or exclude, how they handle missing data, and what they choose to emphasize may result in different messages getting drawn from large-database studies. Still, given the rarity and heterogeneity of sarcomas, these databases remain particularly useful in musculoskeletal cancer research for nationwide incidence estimations, risk factor/prognostic factor assessment, patient demographic and hospital-level variable assessment, patterns of care over time, and hypothesis generation for future prospective studies.

Level of evidence: Level III, therapeutic study.

Conflict of interest statement

Each author certifies that there are no funding or commercial associations (consultancies, stock ownership, equity interest, patent/licensing arrangements, etc.) that might pose a conflict of interest in connection with the submitted article related to the author or any immediate family members. All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research® editors and board members are on file with the publication and can be viewed on request.

Figures

Fig. 1
This flowchart shows how we reviewed the studies from PubMed and the process for arriving at the studies we subsequently reviewed. Note a: In total, there were 301 analyses of the NCDB or SEER databases within 299 discrete studies (two studies analyzed both NCDB and SEER).
Fig. 2
This flowchart shows how we reviewed musculoskeletal sarcoma studies for similar study questions and then classified some as having concordant, discordant, or mixed key findings. Note a: Concordant classification was defined as any two or more studies with similar key findings and similar messages. Discordant classification was defined as any two or more studies with different key findings and different messages. Mixed classification was defined as any two or more studies that consisted of either: (1) different key findings and similar messages or (2) different key findings in some studies compared with secondary findings in other studies and different messages. Note b: Eight studies were included in more than one study question. Five studies had concordant findings for one study question and discordant findings for another study question. One study had mixed findings for two study questions, one study had mixed findings for one study question and discordant findings for another study question, and one study had mixed findings for one study question and concordant findings for another study question.
Fig. 3
This is a graphic representation of how publications have increased by year from 2001 to 2020 for the SEER and NCDB.
