Clinical Practice Guideline for Diagnostic Testing for Adult Obstructive Sleep Apnea: An American Academy of Sleep Medicine Clinical Practice Guideline
Abstract
Introduction:
This guideline establishes clinical practice recommendations for the diagnosis of obstructive sleep apnea (OSA) in adults and is intended for use in conjunction with other American Academy of Sleep Medicine (AASM) guidelines on the evaluation and treatment of sleep-disordered breathing in adults.
Methods:
The AASM commissioned a task force of experts in sleep medicine. A systematic review was conducted to identify studies, and the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) process was used to assess the evidence. The task force developed recommendations and assigned strengths based on the quality of evidence, the balance of benefits and harms, patient values and preferences, and resource use. In addition, the task force adopted foundational recommendations from prior guidelines as “good practice statements” that establish the basis for appropriate and effective diagnosis of OSA. The AASM Board of Directors approved the final recommendations.
Recommendations:
The following recommendations are intended as a guide for clinicians diagnosing OSA in adults. Under GRADE, a STRONG recommendation is one that clinicians should follow under most circumstances. A WEAK recommendation reflects a lower degree of certainty regarding the outcome and appropriateness of the patient-care strategy for all patients. The ultimate judgment regarding propriety of any specific care must be made by the clinician in light of the individual circumstances presented by the patient, available diagnostic tools, accessible treatment options, and resources.
Good Practice Statements:
Diagnostic testing for OSA should be performed in conjunction with a comprehensive sleep evaluation and adequate follow-up. Polysomnography is the standard diagnostic test for the diagnosis of OSA in adult patients in whom there is a concern for OSA based on a comprehensive sleep evaluation.
Recommendations:
We recommend that clinical tools, questionnaires and prediction algorithms not be used to diagnose OSA in adults, in the absence of polysomnography or home sleep apnea testing. (STRONG)
We recommend that polysomnography, or home sleep apnea testing with a technically adequate device, be used for the diagnosis of OSA in uncomplicated adult patients presenting with signs and symptoms that indicate an increased risk of moderate to severe OSA. (STRONG)
We recommend that if a single home sleep apnea test is negative, inconclusive, or technically inadequate, polysomnography be performed for the diagnosis of OSA. (STRONG)
We recommend that polysomnography, rather than home sleep apnea testing, be used for the diagnosis of OSA in patients with significant cardiorespiratory disease, potential respiratory muscle weakness due to neuromuscular condition, awake hypoventilation or suspicion of sleep related hypoventilation, chronic opioid medication use, history of stroke or severe insomnia. (STRONG)
We suggest that, if clinically appropriate, a split-night diagnostic protocol, rather than a full-night diagnostic protocol for polysomnography, be used for the diagnosis of OSA. (WEAK)
We suggest that when the initial polysomnogram is negative and clinical suspicion for OSA remains, a second polysomnogram be considered for the diagnosis of OSA. (WEAK)
Citation:
Kapur VK, Auckley DH, Chowdhuri S, Kuhlmann DC, Mehra R, Ramar K, Harrod CG. Clinical practice guideline for diagnostic testing for adult obstructive sleep apnea: an American Academy of Sleep Medicine clinical practice guideline. J Clin Sleep Med. 2017;13(3):479–504.
INTRODUCTION
The diagnosis of obstructive sleep apnea (OSA) was previously addressed in two American Academy of Sleep Medicine (AASM) guidelines, the “Practice Parameters for the Indications for Polysomnography and Related Procedures: An Update for 2005” and “Clinical Guidelines for the Use of Unattended Portable Monitors in the Diagnosis of Obstructive Sleep Apnea in Adult Patients (2007).”1,2 The AASM commissioned a task force (TF) of content experts to develop an updated clinical practice guideline (CPG) on this topic. The objectives of this CPG are to combine and update information from prior guideline documents regarding the diagnosis of OSA, including the optimal circumstances under which attended in-laboratory polysomnography (hereafter referred to as “polysomnography” or “PSG”) or home sleep apnea testing (HSAT) should be performed.
BACKGROUND
The term sleep-disordered breathing (SDB) encompasses a range of disorders, with most falling into the categories of OSA, central sleep apnea (CSA) or sleep-related hypoventilation. This paper focuses on diagnostic issues related to the diagnosis of OSA, a breathing disorder characterized by narrowing of the upper airway that impairs normal ventilation during sleep. Recent reviews on the evaluation and management of CSA and sleep-related hypoventilation have been published separately by the AASM.3–5
The prevalence of OSA varies significantly based on the population being studied and how OSA is defined (e.g., testing methodology, scoring criteria used, and apnea-hypopnea index [AHI] threshold). The prevalence of OSA has been estimated to be 14% of men and 5% of women, in a population-based study utilizing an AHI cutoff of ≥ 5 events/h (hypopneas associated with 4% oxygen desaturations) combined with clinical symptoms to define OSA.6 OSA may impact a larger proportion of the population than indicated by these numbers, as the definition of AHI used in this study was restrictive and did not consider hypopneas that disrupt sleep without oxygen desaturation. In addition, the estimate excludes individuals with an elevated AHI who do not have sleepiness but who may nevertheless be at risk for adverse consequences such as cardiovascular disease.7–10 In some populations, the prevalence of OSA is substantially higher than this estimate, for example, in patients being evaluated for bariatric surgery (estimated range of 70% to 80%)11 or in patients who have had a transient ischemic attack or stroke (estimated range of 60% to 70%).12 Other disease-specific populations found to have increased rates of OSA include, but are not limited to, patients with coronary artery disease, congestive heart failure, arrhythmias, refractory hypertension, type 2 diabetes, and polycystic ovarian disease.13,14
The consequences of untreated OSA are wide ranging and are postulated to result from the fragmented sleep, intermittent hypoxia and hypercapnia, intrathoracic pressure swings, and increased sympathetic nervous activity that accompany disordered breathing during sleep. Individuals with OSA often feel unrested, fatigued, and sleepy during the daytime. They may suffer from impairments in vigilance, concentration, cognitive function, social interactions and quality of life (QOL). These declines in daytime function can translate into higher rates of job-related and motor vehicle accidents.15 Patients with untreated OSA may be at increased risk of developing cardiovascular disease, including difficult-to-control blood pressure, coronary artery disease, congestive heart failure, arrhythmias and stroke.16 OSA is also associated with metabolic dysregulation, affecting glucose control and risk for diabetes.17 Undiagnosed and untreated OSA is a significant burden on the healthcare system, with increased healthcare utilization seen in those with untreated OSA,18 highlighting the importance of early and accurate diagnosis of this common disorder.
Recognizing and treating OSA is important for a number of reasons. The treatment of OSA has been shown to improve QOL, lower the rates of motor vehicle accidents, and reduce the risk of the chronic health consequences of untreated OSA mentioned above.19 There are also data supporting a decrease in healthcare utilization and cost following the diagnosis and treatment of OSA.20 However, there are challenges and uncertainties in making the diagnosis and a number of questions remain unanswered.
Individuals with OSA can also have other sleep disorders that may be related to or unrelated to OSA. Co-morbid insomnia has been found to be a frequent problem in patients with OSA.21 It is also possible that undiagnosed OSA may be masquerading as another sleep disorder, such as REM Behavior Disorder.22 Therefore, when OSA is suspected, a comprehensive sleep evaluation is important to ensure appropriate diagnostic testing is performed to address OSA, as well as other comorbid sleep complaints.
The diagnosis of OSA involves measuring breathing during sleep. The evolution of measurement techniques and definitions of abnormalities justifies updating the guidelines regarding diagnostic testing, but also complicates the evaluation and summary of evidence gathered from older research studies that have included diagnostic tests with diverse sensor types and scored respiratory events using different definitions. The third edition of the International Classification of Sleep Disorders (ICSD-3) defines OSA as a PSG-determined obstructive respiratory disturbance index (RDI) ≥ 5 events/h associated with the typical symptoms of OSA (e.g., unrefreshing sleep, daytime sleepiness, fatigue or insomnia, awakening with a gasping or choking sensation, loud snoring, or witnessed apneas), or an obstructive RDI ≥ 15 events/h (even in the absence of symptoms).23 In addition to apneas and hypopneas that are included in the AHI, the RDI includes respiratory effort-related arousals (RERAs). The scoring of respiratory events is defined in The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications, Version 2.3 (AASM Scoring Manual).24 However, it should be noted that there is variability in the definition of a hypopnea event. The AASM Scoring Manual recommended definition requires that changes in flow be associated with a 3% oxygen desaturation or a cortical arousal, but allows an alternative definition that requires association with a 4% oxygen desaturation without consideration of cortical arousals. Depending on which definition is used, the AHI may be considerably different in a given individual.25–27 The discrepancy between these and other hypopnea definitions used in research studies introduces complexity in the evaluation of evidence regarding the diagnosis of OSA.
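The effect of the hypopnea definition on the AHI, and the ICSD-3 thresholds quoted above, can be illustrated with a minimal sketch. It is not taken from the AASM Scoring Manual: the class and function names are hypothetical, the thresholds are read as "at least 3%" and "at least 4%", each candidate event is assumed to have already met the flow-reduction criteria, and the event counts are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypopnea:
    """A candidate hypopnea that already meets the flow-reduction criteria (assumption)."""
    desaturation_pct: float  # oxygen desaturation associated with the flow reduction
    cortical_arousal: bool   # True if the event is associated with a cortical arousal


def recommended_rule(h: Hypopnea) -> bool:
    # Recommended definition: at least 3% desaturation OR a cortical arousal.
    return h.desaturation_pct >= 3.0 or h.cortical_arousal


def alternative_rule(h: Hypopnea) -> bool:
    # Alternative definition: at least 4% desaturation; arousals are not considered.
    return h.desaturation_pct >= 4.0


def ahi(apneas: int, hypopneas: List[Hypopnea], sleep_hours: float,
        rule: Callable[[Hypopnea], bool]) -> float:
    """Apneas plus scored hypopneas per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    scored = sum(1 for h in hypopneas if rule(h))
    return (apneas + scored) / sleep_hours


def meets_icsd3_rdi_threshold(obstructive_rdi: float, has_typical_symptoms: bool) -> bool:
    # ICSD-3 as quoted in the text: RDI >= 5 with symptoms, or RDI >= 15 regardless.
    return obstructive_rdi >= 15 or (obstructive_rdi >= 5 and has_typical_symptoms)


# Hypothetical night: 12 apneas over 6 hours of sleep, plus 30 candidate hypopneas.
candidates = (
    [Hypopnea(3.0, False)] * 10    # 3% desaturation, no arousal
    + [Hypopnea(1.0, True)] * 10   # arousal only
    + [Hypopnea(4.5, False)] * 10  # 4.5% desaturation
)
ahi_recommended = ahi(12, candidates, 6.0, recommended_rule)  # (12 + 30) / 6
ahi_alternative = ahi(12, candidates, 6.0, alternative_rule)  # (12 + 10) / 6
```

In this invented example the same recording yields an AHI of 7.0 events/h under the recommended definition and roughly 3.7 events/h under the alternative definition, so the two rules fall on opposite sides of a cutoff of 5 events/h.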
Due to the high prevalence of OSA, there is significant cost associated with evaluating all patients suspected of having OSA with PSG (currently considered the gold standard diagnostic test). Further, there also may be limited access to in-laboratory testing in some areas. HSAT, which has limitations, is an alternative method to diagnose OSA in adults, and may be less costly and more efficient in some populations. This guideline addresses some of these issues using an evidence-based approach.
There are potential disadvantages to using HSAT, relative to PSG, because of the differences in the physiologic parameters being collected and the availability of personnel to adjust sensors when needed. The sensor technology used by HSAT devices varies considerably by the number and type of sensors that are utilized. Traditionally, sleep studies have been categorized as Type I, Type II, Type III or Type IV. Unattended studies fall into categories Type II through Type IV. Type II studies use the same monitoring sensors as full PSGs (Type I) but are unattended, and thus can be performed outside of the sleep laboratory. Type III studies use devices that measure limited cardiopulmonary parameters: two respiratory variables (e.g., effort to breathe, airflow), oxygen saturation, and a cardiac variable (e.g., heart rate or electrocardiogram). Type IV studies utilize devices that measure only 1 or 2 parameters, typically oxygen saturation and heart rate, or in some cases, just air flow. This classification of sleep study devices fails to consider new technologies, such as peripheral arterial tonometry (PAT), and thus an alternative classification scheme has been proposed: the SCOPER classification, which incorporates Sleep, Cardiovascular, Oximetry, Position, Effort and Respiratory parameters.28 The SCOPER system allows for the inclusion of technologies such as PAT. However, due to the complexity of the SCOPER classification, and lack of familiarity with it amongst practicing clinicians, the TF elected to refer to HSAT devices by the traditional Type II through Type IV classification system, and to identify specific devices with technology outside of this schema when appropriate. Regardless, as can be recognized by both classifications, HSAT devices, in comparison to attended studies, carry a higher risk of technical failure due to a lack of real-time monitoring, and have inherent limitations resulting from the inability of most devices to distinguish sleep from wake.
Another potential disadvantage is that positive airway pressure (PAP) cannot be initiated during a HSAT, but can be initiated during a PSG if needed.
Measurement error is inevitable in HSAT, compared against PSG, as standard sleep staging channels are not typically monitored in HSAT (e.g., EEG, EOG and EMG monitoring are not typically performed), which results in use of the recording time rather than sleep time to define the denominator of the respiratory event index (REI; the term used to represent the frequency of apneas and hypopneas derived from HSAT). HSAT devices that use conventional sensors are unable to detect hypopneas only associated with cortical arousals, which are included in the recommended AHI scoring rule in the AASM Scoring Manual.24 Sensor dislodgement and poor quality signal during HSAT are additional contributors to the measurement error of the REI. All these factors can result in the underestimation of the “true” AHI, and may result in the need for repeated studies due to inadequate data for diagnosis.
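The denominator effect described above can be shown with a short arithmetic sketch. The function name and all numbers are hypothetical and chosen only for illustration; the sketch assumes the same events are detected by both tests, which isolates the effect of dividing by recording time instead of sleep time.

```python
def events_per_hour(event_count: int, hours: float) -> float:
    """Frequency of respiratory events per hour of the chosen denominator."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return event_count / hours


# Hypothetical night: 60 apneas and hypopneas, 8 hours of recording, 6 hours of sleep.
events = 60
total_sleep_time_h = 6.0      # available from PSG sleep staging (EEG, EOG, EMG)
total_recording_time_h = 8.0  # the only denominator available to most HSAT devices

ahi = events_per_hour(events, total_sleep_time_h)      # 60 / 6 = 10.0 events/h
rei = events_per_hour(events, total_recording_time_h)  # 60 / 8 = 7.5 events/h
```

Because recording time is at least as long as sleep time, the REI computed this way cannot exceed the AHI for the same set of events; any hypopneas missed by the HSAT sensors would lower the REI further.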
As a diagnostic guideline, our systematic review and recommendations incorporate evidence regarding the accuracy of HSAT for diagnosing OSA. However, diagnosis occurs in the context of management of a patient within the healthcare system, and therefore, outcomes other than diagnostic accuracy are relevant in the evaluation of management strategies. These include the impact on clinical outcomes (e.g., sleepiness, QOL, morbidity, mortality, adherence to therapy) and efficiency of care (e.g., time to test, time to treatment, costs). Therefore, these outcomes are also considered in the formulation of the current guideline.
Prior AASM guidelines1,2 on the diagnosis of OSA included statements that the TF determined were no longer pertinent. Thus, these statements were not addressed in the current update. Moreover, prior guidelines included consensus statements that had not been specifically evaluated in clinical studies. Despite this limitation, two of these statements were adopted in the current guideline as foundational statements that underpin the provision of high quality care for the diagnosis of OSA (see good practice statements). The scope of this guideline did not include a comprehensive update of technical specification for diagnostic testing for OSA. Nevertheless, the TF considered whether currently recommended technology was used in the research studies that were evaluated. In particular, the TF determined that the use of currently AASM recommended flow (nasal pressure transducer and thermistor) and effort sensors (respiratory inductance plethysmography) during PSG and HSAT increased the value of evidence derived from validation studies.24 As part of the data extraction process, validation studies were classified based on whether the currently recommended respiratory sensors were used for PSG or HSAT.
METHODS
Expert Task Force
The AASM commissioned a TF of board-certified sleep medicine physicians, with expertise in the diagnosis and management of adults with OSA, to develop this guideline. The TF was required to disclose all potential conflicts of interest (COI) according to the AASM's COI policy, both prior to being appointed to the TF, and throughout the research and writing of this paper. In accordance with the AASM's conflicts of interest policy, TF members with a Level 1 conflict were not allowed to participate. TF members with a Level 2 conflict were required to recuse themselves from any related discussion or writing responsibilities. All relevant conflicts of interest are listed in the Disclosures section.
PICO Questions
A PICO (Patient, Population or Problem, Intervention, Comparison, and Outcomes) question template was used to develop clinical questions to be addressed in this guideline. PICO questions were developed based on a review of the existing AASM practice parameters on indications for use of PSG and HSAT for the diagnosis of patients with OSA, and a review of systematic reviews, meta-analyses, and guidelines published since 2004. The AASM Board of Directors (BOD) approved the final list of PICO questions presented in Table 1 before the literature search was performed. The PICO questions identify the commonly used approaches and devices for the diagnosis of OSA. Based on their expertise, the TF developed a list of patient-oriented clinically relevant outcomes that are indicative of whether a treatment should be recommended for clinical practice. A summary of the critical outcomes for each PICO is presented in Table 2. Lastly, clinical significance thresholds, used to determine if a change in an outcome was clinically significant, were defined for each outcome by TF clinical judgment, prior to statistical analysis. The clinical significance thresholds are presented by outcome in Table 3. It should be noted that there was insufficient evidence to directly address PICO question 1, as no studies were identified that compared the efficacy of clinical prediction algorithms to history and physical exam. However, the TF decided to compare the efficacy of clinical prediction algorithms to PSG and HSAT.
Table 1
Table 2
Table 3
Literature Searches, Evidence Review and Data Extraction
The TF performed a systematic review of the scientific literature to identify articles that addressed at least one of the nine PICO questions. Multiple literature searches were performed by AASM staff using the PubMed and Embase databases, throughout the guideline development process (see Figure 1). The search yielded articles with various study designs; however, the analysis was limited to randomized controlled trials (RCTs) and observational studies. The articles that were cited in the 2007 AASM clinical practice guideline,2 2005 practice parameter,1 2003 review,29 and 1997 review30 were included for data analysis if they met the study inclusion criteria described below.
The literature searches in PubMed were conducted using a combination of MeSH terms and keywords as presented in the supplemental material. The PubMed database was searched from January 1, 2005 through July 26, 2012 for any relevant literature published since the last guideline. The PubMed search was expanded on September 26, 2012 to identify relevant articles published prior to January 1, 2005. Literature searches were also performed in Embase using a combination of terms and keywords as presented in the supplemental material. The Embase database was searched from January 1, 2005 through September 13, 2012. These searches yielded a total of 3,937 articles. There were 205 duplicates identified, resulting in a total of 3,732 articles from both databases.
A second round of literature searches was performed in PubMed and Embase to capture more recent literature. The PubMed database was searched from July 27, 2012 to December 23, 2013, and the Embase database was searched from September 13, 2012 to December 23, 2013. These searches yielded a total of 2,061 articles. There were 670 duplicates identified resulting in 1,391 additional papers from both databases.
A final literature search was performed in PubMed to capture the latest literature. The PubMed database was searched from December 24, 2013 to June 29, 2016 and identified 2,129 articles.
Based on their expertise and familiarity with the literature, TF members submitted additional relevant literature and screened reference lists to identify articles of potential interest. This served as a “spot check” for the literature searches to ensure that important articles were not overlooked and identified an additional 140 publications.
A total of 7,392 abstracts were assessed by two reviewers to determine whether they met inclusion criteria presented in the supplemental material. Articles were excluded per the criteria listed in the supplemental material and Figure 1. A total of 98 articles were included in the evidence base for recommendations. A total of 86 studies were included in meta-analysis and/or grading.
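The screening total can be reconciled with the search counts reported above. The following sketch simply re-adds the published figures; the variable names are illustrative and are not part of the guideline.

```python
# Counts as reported in the text for each search round.
first_round_hits, first_round_duplicates = 3937, 205    # PubMed + Embase, through 2012
second_round_hits, second_round_duplicates = 2061, 670  # PubMed + Embase, through 2013
final_round_hits = 2129                                  # PubMed only, through June 29, 2016
tf_submitted = 140                                       # added by task force "spot check"

first_round_unique = first_round_hits - first_round_duplicates     # 3,732
second_round_unique = second_round_hits - second_round_duplicates  # 1,391

total_abstracts_screened = (
    first_round_unique + second_round_unique + final_round_hits + tf_submitted
)
print(total_abstracts_screened)  # 7392
```

The sum matches the 7,392 abstracts reported as screened, which suggests that the final PubMed round and the task force submissions were counted without further de-duplication.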
Meta-Analysis
Meta-analysis was performed on both diagnostic and clinical outcomes of interest for each PICO question, when possible. Outcomes data for diagnostic approaches were categorized as follows: clinical tools, questionnaires, and prediction algorithms; history and physical exam; HSAT; attended PSG; split-night attended PSG; two-night attended PSG; single-night HSAT; multiple-night HSAT; follow-up attended PSG; and follow-up HSAT. The type of HSAT devices identified in literature search included type 2; type 3; 2–3 channel; single channel; oximetry; and PAT. A definition of these devices has been previously described.31 Adult patients were categorized as follows: suspected OSA; suspected OSA with comorbid conditions; diagnosed OSA; and scheduled for upper airway surgery.
For diagnostic outcomes, the pretest probability for OSA (i.e., the prevalence within the study population), sensitivity and specificity of the tested diagnostic approach, and number of patients for each study were used to derive two-by-two tables (i.e., the number of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) diagnoses per 1,000 patients) in both high risk and low risk patients, for each OSA severity threshold (i.e., AHI ≥ 5, AHI ≥ 15, AHI ≥ 30). For analyses that included five or more studies, pooled estimates of sensitivity, specificity, and accuracy were calculated using hierarchical random effects modeling performed in STATA software (accuracy was derived by HSROC curves). When analyses included fewer than five studies, ranges of sensitivity, specificity and accuracy were used. Based on their clinical expertise and a review of available literature, the TF established estimates of OSA prevalence among “low risk” and “high risk” patients for each OSA severity threshold. The TF envisioned a sleep clinic cohort of middle-aged obese men with typical symptoms of OSA as an example of a high-risk patient population. In contrast, a sleep clinic cohort of younger non-obese women with possible OSA symptoms was used as a prototype for a low risk patient population. Prevalence estimates for these populations are presented in Table 4.
Table 4
The sensitivity and specificity of included studies were entered into Review Manager 5.3 software to generate forest plots for each analysis. The estimates of sensitivity and specificity (pooled or ranges), and OSA prevalence were entered into the GRADE (Grading of Recommendations Assessment, Development and Evaluation) Guideline Development Tool (GDT) to generate the two-by-two tables. The TF determined the downstream consequences of an accurate diagnosis versus an inaccurate diagnosis (see supplemental material, Table S1), and used the estimates to weigh the benefits of an accurate diagnosis versus the harms of an inaccurate diagnosis. This information was used, in part, to assess whether a given diagnostic approach could be recommended when compared against PSG.
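The way a two-by-two table per 1,000 patients follows from prevalence, sensitivity and specificity can be sketched as below. This is a generic illustration of the arithmetic, not the GRADE GDT implementation; the function name is hypothetical and the input values are invented rather than taken from Table 4 or from any included study.

```python
from typing import Dict


def two_by_two_per_n(prevalence: float, sensitivity: float, specificity: float,
                     n: int = 1000) -> Dict[str, float]:
    """Expected TP, FN, TN and FP counts among n tested patients."""
    for name, value in (("prevalence", prevalence),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    with_osa = prevalence * n       # patients who truly have OSA at this threshold
    without_osa = n - with_osa      # patients who truly do not
    tp = sensitivity * with_osa     # correctly identified as having OSA
    fn = with_osa - tp              # missed diagnoses
    tn = specificity * without_osa  # correctly identified as not having OSA
    fp = without_osa - tn           # incorrectly labeled as having OSA
    return {"TP": tp, "FN": fn, "TN": tn, "FP": fp}


# Hypothetical high-risk cohort: 60% prevalence, test sensitivity 0.90, specificity 0.80.
table = two_by_two_per_n(prevalence=0.60, sensitivity=0.90, specificity=0.80)
# Roughly 540 TP, 60 FN, 320 TN and 80 FP per 1,000 patients.
```

Repeating the calculation with a lower prevalence for the same test shifts the table toward true negatives and false positives, which is why the TF evaluated each approach separately in high risk and low risk populations.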
For clinical outcomes of interest, data on change scores were entered into the Review Manager 5.3 software to derive the mean difference and standard deviation between the experimental diagnostic approach and the gold standard or comparator. For studies that did not report change scores, data from posttreatment values taken from the last treatment time-point were used for meta-analysis. All meta-analyses of clinical outcomes were performed using the random effects model with results displayed as a forest plot. There was insufficient evidence to perform meta-analyses for PICOs 3 and 9, thus no recommendations are provided.
Interpretation of clinical significance for the clinical outcomes of interest was conducted by comparing the absolute effects to the clinical significance threshold previously determined by the TF for each clinical outcome of interest (see Table 3).
Strength of Recommendations
The assessment of evidence quality was performed according to the GRADE process.32 The TF assessed the following four components to determine the direction and strength of a recommendation: quality of evidence, balance of beneficial and harmful effects, patient values and preferences and resource use as described below.
Quality of evidence: based on an assessment of the overall risk of bias (randomization, blinding, allocation concealment, selective reporting, and author disclosures), imprecision (clinical significance thresholds), inconsistency (I² cutoff of 75%), indirectness (study population), and risk of publication bias (funding sources), the TF determined their overall confidence that the estimated effect found in the body of evidence was representative of the true treatment effect that patients would see. For diagnostic accuracy studies, the QUADAS-2 tool was used in addition to the quality domains for the assessment of risk of bias in intervention studies. The quality of evidence was based on the outcomes that the TF deemed critical for decision-making.
Benefits versus harms: based on the meta-analysis (if applicable), analysis of any harms or side effects reported within the accepted literature, and the clinical expertise of the TF, the TF determined if the beneficial outcomes of the intervention outweighed any harmful side effects.
Patient values and preferences: based on the clinical expertise of the TF members and any data published on the topic relevant to patient preferences, the TF determined if patients would use the intervention based on the body of evidence, and if patient values and preferences would be generally consistent.
Resource use: based on the clinical expertise of the TF members and a “spot check” for relevant literature, the TF determined resource use to be important for determining whether to recommend the use of HSAT versus PSG, split-night versus full-night PSG and single-night versus multiple-night HSAT diagnostic protocols, and repeat testing. Resource use was not considered in-depth for clinical tools, questionnaires and prediction algorithms, diagnosis in adults with comorbid conditions, and repeat PSG.
Taking these major factors into consideration, each recommendation statement was assigned strength (“STRONG” or “WEAK”). Additional information is provided in the form of “Remarks” immediately following the recommendation statements, when deemed necessary by the TF. Remarks are based on the evidence evaluated during the systematic review, are intended to provide context for the recommendations, and to guide clinicians in implementing the recommendations in daily practice.
Discussions accompany each recommendation to summarize the relevant evidence and explain the rationale leading to each recommendation. These sections are an integral part of the GRADE system and offer transparency to the process.
Approval and Interpretation of Recommendations
A draft of the guideline was available for public comment for a two-week period on the AASM website. The TF took into consideration all the comments received and made revisions when appropriate. The revised guideline was submitted to the AASM BOD who approved these recommendations.
The recommendations in this guideline define principles of practice that should meet the needs of most patients in most situations. This guideline should not, however, be considered inclusive of all proper methods of care or exclusive of other methods of care reasonably used to obtain the same results. A STRONG recommendation is one that clinicians should, under most circumstances, always follow (i.e., something that might qualify as a Quality Measure). A WEAK recommendation reflects a lower degree of certainty in the appropriateness of the patient-care strategy and requires that the clinician use his/her clinical knowledge and experience, and refer to the individual patient's values and preferences to determine the best course of action. The ultimate judgment regarding the suitability of any specific recommendation must be made by the clinician, in light of the individual circumstances presented by the patient, the available diagnostic tools, the accessible treatment options, and available resources.
The AASM expects this guideline to have an impact on professional behavior, patient outcomes, and possibly, health care costs. This clinical practice guideline reflects the state of knowledge at the time of the literature review and will be reexamined and updated as new information becomes available.
CLINICAL PRACTICE RECOMMENDATIONS
The following clinical practice recommendations are based on a systematic review and evaluation of evidence following the GRADE methodology. Remarks are provided to guide clinicians in the implementation of these recommendations. All figures, including meta-analyses and Summary of Findings tables, are presented in the supplemental material. Table 5 shows a summary of the recommendation statements including the strength of recommendation and quality of evidence. A decision tree for the diagnosis of patients suspected of having OSA is presented in Figure 2.
Table 5
The following are good practice statements, the implementation of which is deemed necessary for appropriate and effective diagnosis and management of OSA.
Diagnostic testing for OSA should be performed in conjunction with a comprehensive sleep evaluation and adequate follow-up.
OSA is one of many medical conditions that may be the cause of sleep complaints and other symptoms. Therefore, diagnostic testing for OSA is best carried out after a comprehensive sleep evaluation. The clinical evaluation for OSA should include a thorough sleep history and a physical examination that includes the respiratory, cardiovascular, and neurologic systems. The examiner should pay particular attention to observations regarding snoring, witnessed apneas, nocturnal choking or gasping, restlessness, and excessive sleepiness. It is also important that other aspects of a sleep history be collected, as many patients suffer from more than one sleep disorder or present with atypical sleep apnea symptoms. In addition, medical conditions associated with increased risk for OSA, such as obesity, hypertension, stroke, and congestive heart failure, should be identified. The general evaluation should serve to establish a differential diagnosis, which can then be used to select the appropriate test(s). Follow-up, under the supervision of a board-certified sleep medicine physician, ensures that study findings and recommendations are relayed appropriately and that appropriate expertise in prescribing and administering therapy is available to the patient.
The TF recognizes that there may be specific contexts (e.g., preoperative evaluation of OSA) in which evaluation of OSA needs to occur in an expedited manner, when it may not be practical to perform a comprehensive sleep evaluation prior to diagnostic testing. In such situations, the TF recommends a clinical pathway be developed and administered by a board-certified sleep medicine physician or appropriately licensed medical staff member designated by the board-certified sleep medicine physician. This pathway should include the following elements: a focused evaluation of sleep apnea performed by a clinical provider, and use of tools or questionnaires that capture clinically important information that is reviewed by a board-certified sleep medicine physician prior to testing. Following testing, a comprehensive sleep evaluation and follow-up under the supervision of a board-certified sleep medicine physician should be completed.
Polysomnography is the standard diagnostic test for the diagnosis of OSA in adult patients in whom there is a concern for OSA based on a comprehensive sleep evaluation.
Misdiagnosing patients can lead to significant harm due to lost benefits of therapy in those with OSA, and the prescription of inappropriate therapy in those without OSA. As discussed in the recommendations below, sleep apnea-focused questionnaires and clinical prediction rules lack sufficient diagnostic accuracy, and therefore direct measurement of SDB is necessary to establish a diagnosis of OSA. PSG is widely accepted as the gold standard test for diagnosis of OSA. Further, this test has traditionally been used as the gold standard for comparison to other diagnostic tests, including HSAT. Besides the diagnosis of OSA, PSG can identify co-existing sleep disorders, including other forms of sleep-disordered breathing. In some cases, and within the appropriate context, the use of HSAT as the initial sleep study may be acceptable, as discussed in the recommendations below. However, PSG should be used when HSAT results do not provide satisfactory posttest probability of confirming or ruling out OSA.
Diagnosis of Obstructive Sleep Apnea in Adults Using Clinical Tools, Questionnaires and Prediction Algorithms
Recommendation 1: We recommend that clinical tools, questionnaires and prediction algorithms not be used to diagnose OSA in adults, in the absence of polysomnography or home sleep apnea testing. (STRONG)
Summary
The literature search did not identify publications that directly compared the performance of clinical prediction algorithms to history and physical exam to accurately identify patients with OSA. However, our review identified forty-eight studies that compared the accuracy of clinical tools, questionnaires or prediction algorithms against PSG or HSAT. In the clinic-based setting, clinical tools, questionnaires and prediction algorithms have a low level of accuracy for the diagnosis of OSA at any threshold of AHI consideration. The overall quality of evidence was downgraded to moderate due to inconsistency and imprecision of findings.
Clinical prediction algorithms may be used in sleep clinic patients with suspected OSA, but are not necessary to establish the need for PSG or HSAT and further are not sufficient to substitute for PSG or HSAT. In non-sleep clinic settings, these tools may be more helpful to identify patients who are at increased risk for OSA, but this was beyond the scope of this guideline.
Evaluation with a clinical tool, questionnaire or prediction algorithm may be less burdensome to patients and clinicians than HSAT or PSG; however, their low levels of accuracy make them poor diagnostic tools. Therefore, based on clinical judgment, the TF determined that the harms of using clinical tools, questionnaires, and prediction algorithms to confirm a suspected diagnosis of OSA outweigh the potential benefits. The TF also determined that a vast majority of patients would not favor the use of clinical questionnaires or prediction tools alone to establish the diagnosis of OSA.
Discussion
While the literature search did not identify publications that directly compared the performance of clinical prediction algorithms to history and physical exam, it did identify forty-eight validation studies that compared the accuracy of clinical tools, questionnaires or prediction algorithms against PSG or HSAT. Our recommendations are therefore based on these validation studies, which are described below. Relevant outcomes data from these validation studies are summarized in the supplemental material, Table S2 through Table S36. Due to the uncertainty regarding clinical outcomes for patients misclassified by the prediction rules, the TF was unable to establish a cutoff for the number of misclassified patients that would be considered acceptable. Nevertheless, all the clinical prediction models evaluated resulted in upper ranges of predicted false negatives per 1,000 patients that exceeded 100, a number that was determined by the TF to be clearly excessive for a stand-alone diagnostic test for OSA. In summary, the diagnostic performance of clinical questionnaires, morphometric models, and clinical prediction rules that consider multiple variables, including symptoms, exam findings and subject characteristics, has been evaluated against PSG and/or HSAT. Overall, while sensitivity appears to be in the higher range, it is not sufficient to adequately exclude the possibility of OSA. Specificity tends to be lower, resulting in a higher number of false positives that further limit the utility of these clinical or morphometric rules and models in the diagnosis of OSA. It should also be noted that some of these studies were conducted in focused populations (e.g., commercial drivers, elderly, bariatric surgery patients, etc.), thus limiting generalizability. The following discussion has been organized to review the data by questionnaire or clinical prediction rule.
BERLIN QUESTIONNAIRE: The Berlin Questionnaire consists of eleven questions divided into three categories to classify the patient as high or low risk for OSA.33 Our review identified nineteen studies that evaluated the performance of the Berlin Questionnaire against PSG in the identification of patients with OSA.34–52 The studies were conducted in a wide variety of geographic locations including Brazil,38 Canada,34,42 Greece,37 Iran,36 Korea,40 Turkey,43 and the United States.41,44 Various patient populations were considered, including those in primary care clinics, sleep clinics, the veteran population, and patients with cardiac disease. The patients included in these studies were mostly men (50% or greater in the majority of studies) with suspected OSA; they were overweight or obese, and middle-aged. Overall, the Berlin Questionnaire produced a large number of false negative results, thereby limiting its utility as an instrument to diagnose patients with OSA. Specifically, when assessing the performance of the Berlin Questionnaire in identification of subjects with an AHI cutoff of ≥ 5, the pooled sensitivity was 0.76 (95% CI: 0.72 to 0.80), while the pooled specificity was 0.45 (95% CI: 0.34 to 0.56) (see supplemental material, Figure S1). Assuming a prevalence of 87% in a high-risk population, the result was an unacceptably high number of false negative results of 209 per 1,000 patients (95% CI: 174 to 244) (see supplemental material, Table S2). Furthermore, the questionnaire had suboptimal accuracy, ranging from 56% to 70%; accuracy became progressively more compromised with consideration of higher OSA severity thresholds (see supplemental material, Table S2 through Table S4 and Figure S1 through Figure S6).
Five studies evaluated the performance of the Berlin Questionnaire against HSATs.53–57 When using an AHI cutoff of ≥ 15, the pooled estimate for sensitivity was 0.76 (95% CI: 0.64 to 0.85); specificity was 0.44 (95% CI: 0.30 to 0.58) and accuracy was 67%. When using an AHI cutoff of ≥ 5 and assuming a prevalence of 87% in a high-risk population, the number of false negative results was 531 per 1,000 patients (95% CI: 357 to 679) (see supplemental material, Figure S5 through Figure S7, Table S5 and Table S6).
The quality of evidence for the use of the Berlin Questionnaire was low after being downgraded due to either heterogeneity, indirectness, or imprecision.
EPWORTH SLEEPINESS SCALE: The Epworth Sleepiness Scale (ESS) is a self-reported questionnaire involving eight questions to assess the propensity for daytime sleepiness or dozing.58 Our review identified seven studies that evaluated the performance of the ESS against PSG in the identification of patients with OSA. These studies were conducted in China, Brazil, Croatia, Turkey and the United States, thus reflecting a wide geographic sampling.38,43,50,51,59–61 Participants were those suspected of OSA and included mainly male, middle-aged and overweight or obese individuals. The overall results indicate that the ESS had a large number of false negative results limiting its utility for the diagnosis of OSA. When considering an AHI of ≥ 5, the ESS revealed a range of sensitivity of 0.27–0.72 and specificity of 0.50–0.76 (see supplemental material, Table S8). The ESS demonstrated an accuracy ranging from 51% to 59% for the AHI ≥ 5 cutoff. Therefore, the ESS had a high number of false negative results (range of 244 to 635 per 1,000 patients; assuming a prevalence of 87%). When considering a cutoff of AHI ≥ 15 and assuming a prevalence of 64% in high-risk patients, the number of false negative results increased and ranged from 269 to 506 per 1,000 patients (see supplemental material, Table S9). Findings from one study, comparing the performance of the ESS against HSATs for identification of patients with OSA, showed low sensitivity of 0.36 (95% CI: 0.19 to 0.57) and higher specificity of 0.77 (95% CI: 0.66 to 0.86)57 (see supplemental material, Table S11).
The quality of evidence for the use of the ESS ranged from low to high across different AHI cutoffs after being downgraded due to heterogeneity, indirectness, or imprecision. The TF determined that the overall quality of evidence across AHI cutoffs was low.
STOP-BANG QUESTIONNAIRE: The STOP-BANG questionnaire is an OSA screening tool consisting of four yes/no questions and four clinical attributes.62 We identified ten studies, involving primarily middle-aged, obese males suspected of OSA, that evaluated the performance of the STOP-BANG questionnaire against PSG in the identification of patients with OSA.42,49–52,63–67 The overall findings reveal that the STOP-BANG questionnaire had high sensitivity, but low specificity for the detection of OSA. These findings became more pronounced when higher levels of OSA category cutoffs were considered. The number of potential false negative diagnostic results limits use of the STOP-BANG as an instrument to diagnose individual patients with OSA. Specifically, when considering an AHI ≥ 5 and assuming a prevalence of 87% in high-risk patients, the sensitivity in the studies was 0.93 (95% CI: 0.90 to 0.95), but specificity was 0.36 (95% CI: 0.29 to 0.44) with a range of accuracy of 52 to 53%. The number of false negatives when compared against PSG was 61 per 1,000 patients (95% CI: 43 to 87), assuming a prevalence of 87% (see supplemental material, Figure S8 and Table S12). The sensitivity further improved and specificity was further compromised when progressively higher AHI cutoffs were considered (see supplemental material, Figure S10 through Figure S12, Table S13 and Table S14). The sensitivity and specificity of the STOP-BANG were similar when compared against HSAT55,68 (see supplemental material, Table S15 through Table S17), or against PSG or HSAT62,68 (see supplemental material, Table S18 through Table S20).
The quality of evidence for the use of the STOP-BANG questionnaire ranged from low to high across different AHI cutoffs after being downgraded due to either indirectness, inconsistency, or imprecision. The TF determined that the overall quality of evidence across AHI cutoffs was moderate.
STOP QUESTIONNAIRE: Our review identified five studies that evaluated the diagnostic performance of the STOP questionnaire against PSG.49–51,67,69 The STOP questionnaire showed moderate to high sensitivity, low specificity, and moderate accuracy (see supplemental material, Figure S14 and Table S21 through Table S23). When considering an AHI ≥ 5, the sensitivity was 0.88 (95% CI: 0.77 to 0.94), the specificity was 0.33 (95% CI: 0.18 to 0.52), and the accuracy in a high-risk population ranged from 74% to 86%. Assuming a prevalence of 87%, the number of false negatives was 104 per 1,000 patients (95% CI: 52 to 200) (see supplemental material, Table S21). When considering an AHI cutoff of ≥ 15, the sensitivity ranged from 0.62–0.98, the specificity ranged from 0.10–0.63, and the accuracy in a high-risk population ranged from 60% to 79%. Assuming a prevalence of 64% in a high-risk population, the number of false negatives ranged from 13 to 243 per 1,000 patients (see supplemental material, Table S22). When considering an AHI cutoff of ≥ 30, the sensitivity ranged from 0.91–0.97, the specificity ranged from 0.11–0.36, and the accuracy in a high-risk population ranged from 48% to 49%. Assuming a prevalence of 36% in a high-risk population, the number of false negatives ranged from 11 to 32 per 1,000 patients (see supplemental material, Table S23).
The quality of evidence for the use of the STOP questionnaire ranged from low to moderate across different diagnostic cutoffs and risk groups after being downgraded due to heterogeneity or imprecision. The TF determined that the overall quality of evidence across AHI cutoffs was low.
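The false negative counts quoted above for the Berlin, STOP-BANG, and STOP questionnaires at an AHI cutoff of ≥ 5 can be reproduced from the pooled sensitivity and the assumed 87% prevalence. The following is a minimal Python sketch of that arithmetic; the function name and its parameters are illustrative and do not come from the guideline or its supplemental material.

```python
def false_negatives_per_1000(sensitivity, prevalence, tested=1000):
    """Expected number of patients with OSA that a tool would miss.

    Of `tested` patients, prevalence * tested truly have OSA, and a
    fraction (1 - sensitivity) of those receive a negative result.
    """
    return prevalence * (1.0 - sensitivity) * tested


# Pooled sensitivities against PSG at AHI >= 5, as quoted in the text.
pooled_sensitivity = {
    "Berlin": 0.76,
    "STOP-BANG": 0.93,
    "STOP": 0.88,
}
HIGH_RISK_PREVALENCE = 0.87  # prevalence assumed in the text

for tool, sens in pooled_sensitivity.items():
    missed = false_negatives_per_1000(sens, HIGH_RISK_PREVALENCE)
    print(f"{tool}: about {round(missed)} false negatives per 1,000")
```

With these inputs the sketch gives approximately 209, 61, and 104 false negatives per 1,000 patients, in line with the figures reported above. Applying the same function to the bounds of the Berlin sensitivity interval (0.80 and 0.72) gives roughly 174 and 244, which corresponds to the reported interval for that questionnaire.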
MORPHOMETRIC MODELS: Our review identified two studies that used morphometric models to predict OSA that was confirmed using sleep study data.70,71 In a group of hypertensive patients, a multivariable apnea prediction score that combined symptoms, body mass index, age and sex was used to assess OSA risk.70 In another study involving primarily middle-aged males, those with OSA were compared to those without OSA by using a morphometric clinical prediction formula incorporating measures of craniofacial anatomy (e.g., palatal height, maxillary and mandibular intermolar distances).71 While these studies demonstrate relatively high sensitivity (range of 0.88–0.98) to predict AHI ≥ 5, the specificity was quite low (range of 0.11–0.31) (see supplemental material, Table S24). When considering adjusted neck circumference in both the hypertensive and chronic kidney disease populations, there are similar findings of relatively high sensitivity, but poor specificity, with improvements in specificity using higher AHI cutoffs55,70 (see supplemental material, Table S25 and Table S26).
The quality of evidence for the use of morphometric models and adjusted neck circumference was moderate after being downgraded due to imprecision.
MULTIVARIABLE APNEA PREDICTION QUESTIONNAIRE: The performance of the Multivariable Apnea Prediction (MAP) questionnaire has been evaluated against PSG in those with suspected OSA,35,72–74 a sample of hypertensive patients,70 and also a sample of older adults75 with findings of lower levels of specificity and high numbers of false positive results (see supplemental material, Table S27 and Table S28).
The quality of evidence for the use of the MAP questionnaire was judged moderate; it was downgraded due to imprecision.
CLINICAL PREDICTION MODELS: Four studies evaluated the performance of clinical prediction models against PSG,61,76–78 and three studies75,79,80 evaluated these models against HSAT. Two of the studies compared respiratory parameters against PSG: a study involving a Chinese cohort that evaluated snoring while sitting76 and another single study assessing respiratory conductance and oximetry.78 Results demonstrated a sensitivity ranging from 0.33–0.90, and a specificity ranging from 0.50–1.00 using an AHI cutoff of ≥ 5. Other studies compared clinical prediction rules including age, waist circumference, ESS score and minimum oxygen saturation, and another evaluated gender, nocturnal choking, snoring and body mass index against PSG; these reported reasonably high sensitivity (range of 0.72–0.94) and specificity (range of 0.75–0.91) considering different AHI thresholds.61,77 Clinical prediction rules have been evaluated against HSAT in select populations, i.e., the elderly,75 bariatric surgery candidates,79 and commercial drivers.80 These studies75,79,80 reported sensitivities ranging from 0.76–0.97 and specificities ranging from 0.19–0.75 using an AHI cutoff of ≥ 30 (see supplemental material, Table S29 through Table S31).
The quality of evidence for the use of clinical prediction models ranged from moderate to high across the different AHI cutoffs after being downgraded due to imprecision. The TF determined that the overall quality of evidence across AHI cutoffs was moderate.
OTHER OSA PREDICTION TOOLS: Our literature review identified other OSA prediction tools, including the OSA50, the clinical decision support system, the OSAS score, and the Kushida Index. The OSA50 questionnaire involves four components including age ≥ 50, snoring, witnessed apneas and waist circumference.81 A study involving Turkish bus drivers82 and a validation study for the OSA50 in the primary care setting81 showed a sensitivity ranging from 0.49–0.98 and a specificity of 0.82 in both studies (see supplemental material, Table S32 and Table S33).
The performance of a hand-held clinical decision support system (assessing sleep behavior, breathing during sleep and daytime functioning) against PSG was evaluated in a study of veterans with ischemic heart disease. The system showed a high sensitivity of 0.98 (95% CI: 0.92 to 1.00) and a high specificity of 0.87 (95% CI: 0.66 to 0.97)41 (see supplemental material, Table S34).
The OSAS score involves assessment of the Friedman tongue position, tonsil size, and body mass index. In a sample of individuals suspected to have OSA, the sensitivity of the OSAS score was 0.86 (95% CI: 0.80 to 0.91) against PSG at an AHI > 5 cutoff; however, specificity was lower at 0.47 (95% CI: 0.34 to 0.56), with a high number of false positives in the low-risk group39 (see supplemental material, Table S35).
One study evaluating the performance of the Kushida Index against PSG showed a high sensitivity of 0.98 (95% CI: 0.95 to 0.99) and high specificity of 1.00 (95% CI: 0.92 to 1.00) to detect AHI ≥ 5 (see supplemental material, Table S36).
The quality of the evidence for other prediction tools ranged from low to high across different tools, diagnostic cutoffs, and risk groups after being downgraded due to imprecision and indirectness.
OVERALL QUALITY OF EVIDENCE: The quality of evidence for specific clinical tools, questionnaires and prediction algorithms ranged from very low to high after being downgraded due to imprecision, indirectness, and heterogeneity. However, due to the heterogeneity of the tools, questionnaires and prediction algorithms described above combined with the low likelihood that future research would result in a change of the accuracy of these tools, the TF determined that the overall quality of evidence for the recommendation against using clinical tools, questionnaires or predictive tools was moderate.
BENEFITS VERSUS HARMS: These clinical tools, questionnaires and prediction algorithms carry the risk of not capturing the diagnosis of OSA when indeed OSA is present. Given the downstream effects of false negative diagnostic results, this would translate into high levels of OSA-related decrements in QOL, morbidity, and mortality due to undiagnosed and untreated OSA. On the other hand, false positive results would result in unnecessary testing and treatment for sleep apnea. Therefore, the TF determined that the potential harms outweigh the potential benefits of using clinical tools, questionnaires and prediction algorithms alone to diagnose OSA.
PATIENTS' VALUES AND PREFERENCES: Evaluation with clinical tools, questionnaires or prediction algorithms may be less burdensome to the patient and physician, when compared to HSAT or PSG. However, this must be weighed against their low levels of accuracy and the likelihood of misdiagnosis. In contrast, PSG and HSAT require more resources and create more burden for the patient and physician; however, they provide greater value in terms of higher diagnostic accuracy and therefore a higher likelihood that patients will receive appropriate treatment. Based on its clinical judgment, the TF determined that the vast majority of patients would not favor the use of clinical questionnaires or prediction tools alone for the diagnosis of OSA.
Home Sleep Apnea Testing for the Diagnosis of Obstructive Sleep Apnea in Adults
Recommendation 2: We recommend that polysomnography, or home sleep apnea testing with a technically adequate device, be used for the diagnosis of OSA in uncomplicated adult patients presenting with signs and symptoms that indicate an increased risk of moderate to severe OSA. (STRONG)
Recommendation 3: We recommend that if a single home sleep apnea test is negative, inconclusive or technically inadequate, polysomnography be performed for the diagnosis of OSA. (STRONG)
Remarks: The following remarks are based on specifications used by studies that support these recommendation statements:
An uncomplicated patient is defined by the absence of:
Conditions that place the patient at increased risk of non-obstructive sleep-disordered breathing (e.g., central sleep apnea, hypoventilation and sleep related hypoxemia). Examples of these conditions include significant cardiopulmonary disease, potential respiratory muscle weakness due to neuromuscular conditions, history of stroke and chronic opiate medication use.
Concern for significant non-respiratory sleep disorder(s) that require evaluation (e.g., disorders of central hypersomnolence, parasomnias, sleep related movement disorders) or interfere with accuracy of HSAT (e.g., severe insomnia).
Environmental or personal factors that preclude the adequate acquisition and interpretation of data from HSAT.
An increased risk of moderate to severe OSA is indicated by the presence of excessive daytime sleepiness and at least two of the following three criteria: habitual loud snoring, witnessed apnea or gasping or choking, or diagnosed hypertension.
HSAT is to be administered by an accredited sleep center under the supervision of a board-certified sleep medicine physician, or a board-eligible sleep medicine provider.
A single HSAT recording is conducted over at least one night.
A technically adequate HSAT device incorporates a minimum of the following sensors: nasal pressure, chest and abdominal respiratory inductance plethysmography, and oximetry; or else PAT with oximetry and actigraphy. For additional information regarding HSAT sensor requirements, refer to The AASM Manual for the Scoring of Sleep and Associated Events.24
A technically adequate diagnostic test includes a minimum of 4 hours of technically adequate oximetry and flow data, obtained during a recording attempt that encompasses the habitual sleep period.
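As an illustration only, the increased-risk criterion and the minimum recording duration stated in the remarks above can be written as simple checks. This Python sketch is not a validated clinical decision tool; the class, field, and function names are hypothetical, and the remaining elements of the remarks (absence of complicating conditions, sensor requirements, supervision) are not modeled and still require clinical judgment.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    # Hypothetical field names; each is a finding from the sleep evaluation.
    excessive_daytime_sleepiness: bool
    habitual_loud_snoring: bool
    witnessed_apnea_or_gasping_or_choking: bool
    diagnosed_hypertension: bool


def has_increased_risk_of_moderate_to_severe_osa(p: Presentation) -> bool:
    """Excessive daytime sleepiness plus at least two of the three criteria."""
    supporting = sum([
        p.habitual_loud_snoring,
        p.witnessed_apnea_or_gasping_or_choking,
        p.diagnosed_hypertension,
    ])
    return p.excessive_daytime_sleepiness and supporting >= 2


# Minimum hours of technically adequate oximetry and flow data (from the remarks).
MIN_ADEQUATE_HOURS = 4.0


def has_adequate_recording_duration(adequate_hours: float) -> bool:
    """Duration check only; it does not assess which sensors were used."""
    return adequate_hours >= MIN_ADEQUATE_HOURS
```

For example, a sleepy patient with habitual loud snoring and diagnosed hypertension would satisfy the risk criterion, whereas a patient with all three supporting findings but no excessive daytime sleepiness would not.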
Summary
Twenty-six validation studies suggested potential for clinically significant diagnostic misclassification using HSAT when compared against PSG. However, seven RCTs failed to find, after PAP initiation, that patient-reported sleepiness, QOL, and continuous positive airway pressure (CPAP) adherence were significantly different when HSAT was used. The RCTs used HSAT in the context of a management pathway that required PSG confirmation for patients in whom HSAT did not establish an OSA diagnosis, under the conditions specified in the above remarks. The overall quality of evidence was moderate due to imprecision, inconsistency, or indirectness. Therefore, in this context, either PSG or HSAT is recommended for the diagnosis of OSA. However, two other considerations are also key. First, a clinician's choice of study type for a particular patient should be guided by clinical judgment and incorporate consideration of patient preferences. Second, it is essential to note that the need for diagnosis of OSA is not limited to uncomplicated patients who are at increased risk for moderate to severe OSA. In patients who do not meet these criteria, but in whom there is a concern for OSA based on a comprehensive sleep evaluation, PSG is recommended.
HSAT is less sensitive than PSG in detection of OSA and a false negative test could result in harm to the patient due to denial of a beneficial therapy. For this reason, the majority of RCTs that were judged most generalizable to clinical practice required that PSG eventually be performed when HSAT did not confirm a diagnosis of OSA.83–85 Performing a repeat HSAT is not recommended when an initial test is negative, inconclusive or technically inadequate, due to the higher likelihood that a second test will also be negative, inconclusive or technically inadequate. There is also an increased risk that the patient will not complete the diagnostic process prior to a definitive diagnosis. Therefore, after a single negative, inconclusive or technically inadequate HSAT result, performance of a PSG is strongly recommended.
Finally, use of HSAT to diagnose OSA has been shown to provide adequate clinical outcomes and efficiency of care when performed with adequate clinical and technical expertise, using specific types of HSAT devices, in an appropriate patient population, and within an appropriate management pathway. Use of HSAT in other contexts may not provide similar benefit, and therefore the recommendations for the use of HSAT are limited. On the other hand, unstudied or understudied contexts could exist in which HSAT may provide benefit to a patient.
The TF determined that the benefits of HSAT compared to PSG balanced the potential harms, when used in the patient populations and under the conditions specified in the above remarks and recommendations. Based on clinical judgment, the TF determined that many patients would value the convenience and potential cost savings of HSAT, while other patients would prefer the superior accuracy of PSG, the increased probability that only one diagnostic test will be needed, and the potential ability to titrate positive airway pressure if indicated.
Discussion
The formulation of these recommendation statements was guided by evidence from twenty-six validation studies that evaluated the diagnostic accuracy of HSAT against PSG,35,53,62,67,81,86–106 as well as seven RCTs that compared clinical outcomes from management pathways.83–85,107–110 Four of these RCTs were determined to be most relevant to clinical practice, as they did not require oximetry testing as a criterion for inclusion and used conventional methods for determination of PAP pressures (i.e., APAP or attended titration).83–85,110 This subset of studies will be referred to as “RCTs most generalizable to clinical practice” for the remainder of this discussion section.
ACCURACY: The following paragraphs are organized by type of HSAT device and components or combinations of components, as described in the literature.
A total of twenty-six validation studies were identified that reported accuracy outcomes. The data from these validation studies are summarized in the supplemental material, Table S37 through Table S58. In two studies that evaluated the performance of Type 2 HSAT devices against PSG,67,86 when using an AHI ≥ 5 cutoff, accuracy in a high-risk population (assuming a prevalence of 87%) ranged from 84% to 91%. Using a cutoff of AHI ≥ 15, the accuracy of these devices was 88% in a high-risk group (see supplemental material, Table S37 and Table S38).67,86 Seven studies evaluated the performance of Type 3 HSAT devices against PSG, but the AHI cutoffs employed varied across studies, resulting in sub-grouping by AHI cutoffs for our analyses.87–93 When using an AHI ≥ 5 cutoff, accuracy in a high-risk population (assuming a prevalence of 87%) ranged from 84% to 91%, whereas in a low-risk population (assuming a prevalence of 55%) accuracy ranged from 70% to 78% based on the seven studies (see supplemental material, Table S39). Using a cutoff of AHI ≥ 15, the accuracy of these devices in a high-risk population ranged from 65% to 91%, based on six studies87,89–92,94 (see supplemental material, Table S40). Using a cutoff of AHI ≥ 30, the accuracy of the devices in the high-risk population was 88% (95% CI: 81% to 94%), based on five studies (see supplemental material, Table S41).
Five studies evaluated the performance of 2–3 channel HSAT devices against PSG. In a high-risk population using cutoffs of AHI ≥ 5,95–97 AHI ≥ 15,95–99 and AHI ≥ 30,96,97 accuracy ranged from 81% to 93%, 72% to 87%, and 71% to 90%, respectively. Using the same cutoffs in a low-risk population, accuracy ranged from 77% to 88%, 68% to 95%, and 88% to 91%, respectively (see supplemental material, Table S42 through Table S44). When the performance of 2–3 channel HSAT was evaluated against unattended in-home PSG, using a cutoff of AHI ≥ 15, accuracy in a high-risk population was 86% (95% CI: 76% to 93%);53 using a cutoff of AHI ≥ 30, accuracy ranged from 83% to 91% (see supplemental material, Table S45 and Table S46).53,81
Six studies evaluated the performance of single channel HSAT against attended or unattended PSG (see supplemental material, Table S47 through Table S50, and Table S51 through Table S53, respectively).73,100–103,111
A single study evaluated the performance of oximetry against unattended in-home PSG.111 Using a cutoff of AHI ≥ 5, accuracy was 73% (95% CI: 68 to 78%) in a high-risk population, and 79% (95% CI: 74 to 84%) in a low-risk population. Using oximetry to identify OSA at an AHI ≥ 5 cutoff, and assuming a prevalence of 87% in a high-risk population, the findings of the study111 would result in an estimated average of 274 misdiagnosed patients out of 1,000 tested, and 210 misdiagnosed patients out of 1,000 tested in a low-risk group (assuming a prevalence of 55%). Using a cutoff of AHI ≥ 15 and AHI ≥ 30, oximetry has an accuracy of 86% (95% CI: 83 to 91%) and 74% (95% CI: 71 to 76%) in a high-risk population, and an accuracy of 80% (95% CI: 75 to 84%) and 63% (95% CI: 59 to 67%) in a low-risk population, respectively (see supplemental material, Table S51 through Table S53).
A single study evaluated the performance of PAT, oximetry, and actigraphy against simultaneous unattended in-home PSG and reported a sensitivity of 0.88 (95% CI: 0.47 to 1.00), specificity of 0.87 (95% CI: 0.66 to 0.97) and accuracy of 88% (95% CI: 50 to 100%) in high-risk patients using a cutoff of AHI ≥ 5.104 These findings would result in 121 misdiagnosed patients out of 1,000 tested in a high-risk population (based on a prevalence of 87%), and 125 misdiagnosed patients out of 1,000 tested in a low-risk population (based on a prevalence of 55%) (see supplemental material, Table S54).104 Two cross-over studies randomized patients to home-based PAT, and in-laboratory simultaneous PSG and PAT.105,106 For comparison to in-laboratory PSG, only the home-based PAT data were used for this recommendation. A single study that evaluated the performance of the PAT device in the home against in-laboratory PSG using a cutoff of AHI ≥ 5,106 reported a specificity of 0.43 (95% CI: 0.22 to 0.66). When two studies evaluated the home-based PAT device against in-laboratory PSG at an AHI cutoff of ≥ 15, specificity ranged from 0.77 to 1.00 and sensitivity ranged from 0.92 to 0.96.105,106 A single study evaluated the PAT device at an AHI cutoff of ≥ 30, and reported a specificity of 0.82 (95% CI: 0.57 to 0.96) and sensitivity of 0.92 (95% CI: 0.62 to 1.00)105 (see supplemental material, Table S55 through Table S57).
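The misdiagnosis counts quoted above follow from sensitivity, specificity, and the assumed prevalence. A minimal Python sketch of that arithmetic is shown below. The function name `misdiagnosed_per_1000` is hypothetical, and the inputs are the rounded values quoted in the text, so the results can differ slightly from the published figures.

```python
def misdiagnosed_per_1000(sensitivity, specificity, prevalence, n=1000):
    """Expected false negatives plus false positives among n tested patients."""
    diseased = prevalence * n          # patients who truly have OSA
    healthy = n - diseased             # patients who truly do not
    false_negatives = diseased * (1.0 - sensitivity)
    false_positives = healthy * (1.0 - specificity)
    return false_negatives + false_positives


# Rounded values quoted in the text for PAT at an AHI >= 5 cutoff.
high_risk = misdiagnosed_per_1000(0.88, 0.87, prevalence=0.87)
low_risk = misdiagnosed_per_1000(0.88, 0.87, prevalence=0.55)
print(f"high-risk: {high_risk:.1f} per 1,000")
print(f"low-risk: {low_risk:.1f} per 1,000")
```

With these rounded inputs the sketch gives about 121.3 and 124.5 per 1,000, in line with the 121 and 125 reported above.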
The quality of evidence for diagnostic accuracy was downgraded due to indirectness, imprecision, or inconsistency. The quality ranged from low to high based on different tools and algorithms, diagnostic cutoffs, and risk groups.
The potential consequences for patients classified in true and false positive or negative categories are summarized in the supplemental material, Table S1. The TF concluded that the number of patients potentially misclassified by HSAT was high enough to be of clinical concern, particularly when tests were inconclusive or negative. In a population that has increased risk of moderate to severe OSA, both the increased likelihood of false negatives and the significant impact of missed diagnoses on patient outcomes can cause significant harm. This reasoning supports requiring a diagnostic test with higher sensitivity (PSG) in this population if HSAT provides a negative or non-diagnostic result.
CLINICAL OUTCOMES ASSESSMENT: The TF concluded that evaluating the impact of diagnostic accuracy on clinical outcomes is complicated by a number of factors that can cause discordance between tests, including night-to-night variability and inconsistent definitions of respiratory events (e.g., hypopneas) between HSAT and PSG. In addition, there is uncertainty regarding clinical outcomes for patients misclassified by HSAT.
For these reasons, studies that compared clinical outcomes in patients randomized to management pathways based on PSG or HSAT diagnostic assessment, within the same research protocol, provide the best opportunity to assess the acceptability of clinical outcomes using HSAT.
SUBJECTIVE SLEEPINESS: A meta-analysis of seven RCTs compared changes in patient-reported sleepiness, using the ESS, in patients diagnosed by HSAT or PSG, followed by PAP titration (see supplemental material, Figure S18).83–85,107–110 The meta-analysis showed a clinically and statistically insignificant difference of 0.38 points (95% CI: −1.07 to 0.32 points) greater improvement in patients randomized to the HSAT pathway versus the attended PSG pathway. This difference indicates that subjective sleepiness is similarly improved in patients who initiate PAP treatment based on diagnosis using either HSAT or PSG. The quality of evidence for subjective sleepiness was high.
QUALITY OF LIFE: Six RCTs, using various validated instruments (i.e., FOSQ, SAQLI, and SF-36), compared QOL in patients diagnosed by HSAT or PSG, followed by PAP titration.84,85,107–110 Meta-analysis demonstrated differences in pooled effects between pathways that were not significant (see supplemental material, Figure S19 through Figure S23, and Table S58). The quality of evidence ranged from moderate to high based on the measure used to assess QOL. The quality of evidence for the SF-36 physical and mental summary scores was downgraded due to imprecision. The TF considered the overall quality of evidence for QOL to be high as FOSQ and SAQLI measures of QOL were considered more critical for decision-making than the SF-36 measures.
CPAP ADHERENCE: Six RCTs evaluated CPAP adherence (mean hours of use per night); meta-analysis found no significant difference between the two assessment pathways (see supplemental material, Figure S24).83–85,108–110 When determining adherence by number of nights with greater than 4 hours of use, meta-analysis of five RCTs found a clinically insignificant trend towards increased CPAP adherence in the HSAT arm versus the PSG arm (see supplemental material, Figure S25).83–85,107,110 The quality of evidence for CPAP adherence was moderate to high across different AHI cutoffs after being downgraded due to imprecision. The TF determined that the overall quality of evidence across AHI cutoffs was high.
FAILURE TO COMPLETE DIAGNOSTIC ALGORITHM: Among the four RCTs most generalizable to clinical practice, three studies83–85 required use of PSG if HSAT was inconclusive (did not provide adequate data or showed a low AHI after 1 or 2 unsuccessful attempts) and after 1 or 2 failed APAP trials (e.g., insufficient use, elevated residual AHI, persistent large leak). Based on data reported by a multicenter RCT there was concern regarding risk of non-completion of diagnostic testing when initial HSAT did not provide a definitive result. Rosen et al. 201284 reported that 30% (10/33) of subjects with technically inadequate HSATs and 16% (14/88) of subjects with low AHI on HSAT failed to proceed per protocol to PSG. There was also evidence indicating reduced effectiveness of repeated HSAT attempts for technical failures: 82% (147/180) of initial HSAT attempts were technically acceptable, whereas only 60% (12/20) of second attempts resulted in a technically acceptable study. Although failure to complete the diagnostic algorithm was not originally considered a critical outcome, the TF ultimately determined that it was critical for decisions regarding follow-up for inconclusive HSAT attempts. The quality of evidence regarding performance of PSG after a single inconclusive HSAT (versus multiple attempts) was low.
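The percentages in this paragraph can be checked directly from the reported counts. The short Python sketch below does only that; the helper name `percent` and the dictionary labels are illustrative and do not come from the cited study.

```python
def percent(numerator, denominator):
    """Return numerator/denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# Counts quoted from Rosen et al. 2012 in the paragraph above.
counts = {
    "inadequate HSAT, did not proceed to PSG": (10, 33),
    "low AHI on HSAT, did not proceed to PSG": (14, 88),
    "first HSAT attempt technically acceptable": (147, 180),
    "second HSAT attempt technically acceptable": (12, 20),
}

for label, (k, n) in counts.items():
    print(f"{label}: {k}/{n} = {percent(k, n):.0f}%")
```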
OVERALL QUALITY OF EVIDENCE: The TF determined that the critical outcome for diagnostic accuracy assessment was the number of false negative results. The quality of evidence for accuracy was downgraded to moderate due to imprecision, inconsistency, or indirectness. The quality of evidence for the clinical outcomes of sleepiness, quality of life, and CPAP adherence was high. Depression and cardiovascular outcomes were also considered critical outcomes; however, evidence for these outcomes was not available. Therefore, the overall quality of evidence for recommendation 2 is moderate.
In addition to accuracy and clinical outcomes, the TF determined that failure to complete the diagnostic algorithm was a critical outcome for repeat testing after a negative, inconclusive or technically inadequate HSAT. The quality of evidence for performing PSG after a single inconclusive HSAT was determined to be low, as only one study addressed this outcome. Therefore, the overall quality of evidence for recommendation 3 is low.
RESOURCE USE: Though a single night of HSAT is less resource-intensive than a single night of PSG, the relative cost-effectiveness of management pathways that incorporate each of these diagnostic strategies is unclear. Economic analyses have compared the cost-effectiveness of management pathways that incorporate diagnostic strategies using HSAT or PSG.112–114 All have concluded that PSG is the preferred diagnostic strategy from an economic perspective for adults suspected to have moderate to severe OSA. An important factor in these analyses is the favorable cost-effectiveness of OSA treatment in patients with moderate to severe OSA, particularly when longer time horizons are considered. As a result, diagnostic strategies that lead to increased false negatives, and leave patients untreated, or increase false positives, and unnecessarily treat patients, have less favorable cost-effectiveness. It is important to note that these economic analyses are susceptible to error because of imprecision in modelling of management pathways and limitations in the quality of data available to estimate parameters. The impact of errors can be magnified when extrapolated over long time horizons.
Relative cost-effectiveness of management pathways that use HSAT or PSG for diagnosis can be assessed in the context of an RCT, if resource utilization is measured. Among the four RCTs most generalizable to clinical practice,83–85,110 only one provided this information.84 The study reported that in-trial costs were 25% less in the home-arm than the in-laboratory-arm.84 These estimates were based on the Medicare Fee Schedule for the various study procedures, including office visits and diagnostic testing, and take into account the need to repeat studies.84 A subsequent cost minimization analysis of this RCT also considered costs from a provider perspective.115 While provider costs (capital, labor, overhead) were generally less for the home program, this was not true for all modelled scenarios. The provider perspective highlighted the large number of cost components necessary to ensure high quality home-based OSA management, which narrowed the cost difference relative to lab management.
The available studies indicate that the potential cost advantages of HSAT over PSG are not as high as reflected by the cost difference of a single night of testing. Even when HSAT is used in appropriate populations and conditions, additional HSAT and PSG are needed for patients with technically inadequate or inconclusive studies, in order to achieve an accurate diagnosis. In addition, if a home management pathway is used in a manner that results in reduced effectiveness relative to PSG, use of HSAT could in fact be less cost-effective than using PSG. Examples of this include use in patient populations with predominantly mild OSA in which there is a higher proportion of negative or indeterminate HSAT results that require follow-up PSG, or use in patients at risk for non-obstructive sleep-related breathing disorders that may not be accurately diagnosed with HSAT. The TF determined that if HSAT is used in the recommended context and management pathway, it would be more cost-effective than if it is used outside this framework.
BENEFITS VERSUS HARMS: Use of HSAT may provide potential benefits to patients with suspected OSA. Such benefits could include convenience, comfort, increased access to testing, and decreased cost. HSAT can be performed in the home environment with fewer attached sensors during sleep. The availability of HSAT for diagnosis may improve access to diagnostic testing in resource-limited settings, or when the patient is unable to leave the home or healthcare setting for testing. In addition, HSAT may be less costly when used appropriately. These benefits must be weighed against the potential for harm. Harms could result from the need for additional diagnostic testing among patients with technically inadequate or inconclusive HSAT findings, or from misdiagnosis and subsequent inappropriate therapy or lack of therapy. As summarized above, the use of HSAT has not been demonstrated to provide inferior clinical benefit compared to PSG when used in the appropriate context. Therefore, the TF determined that if HSAT is used in the context described in the recommendations and remarks, the risk of harm is minimized and the probability of potential economic benefits is increased.
The TF was concerned that, in clinical practice (in contrast to the RCT setting) there would be higher levels of drop out from diagnostic testing, among patients with initial study attempts that did not result in diagnoses of OSA. In particular, there was concern that patients with false negative HSAT results may not complete additional testing after learning of a negative result, despite the presence of symptoms of OSA. In addition, as described above, HSAT is less accurate than PSG and more likely to result in false negative results. For these reasons, the TF recommends that if the initial HSAT shows a negative or inconclusive result, PSG, rather than a second HSAT, should be performed. There are similar concerns that, following a technically inadequate HSAT, repeat HSAT may be associated with a higher rate of technical failure on the second study, and with increased risk of drop out from the diagnostic process. Therefore, the TF also recommends that if the initial HSAT is technically inadequate, PSG rather than a second HSAT should be performed. On the other hand, the TF recognizes that there may be specific circumstances in which repeat HSAT is appropriate after an initial failed HSAT. These circumstances would include cases in which both of the following are present: the clinician determines that there is a high likelihood of successful recording on a second attempt, and the patient expresses a preference for this approach.
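The follow-up logic described in this paragraph can be summarized as a small decision function. The Python sketch below is illustrative only: the names `HsatResult` and `next_diagnostic_step`, the return strings, and the treatment of a positive result as needing no further diagnostic test are assumptions made for the example. It is not a substitute for clinical judgment.

```python
from enum import Enum


class HsatResult(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    INCONCLUSIVE = "inconclusive"
    TECHNICALLY_INADEQUATE = "technically inadequate"


def next_diagnostic_step(result, repeat_likely_to_succeed=False,
                         patient_prefers_repeat=False):
    """Suggest the next test after an initial HSAT (hypothetical helper)."""
    if result is HsatResult.POSITIVE:
        # Assumption: a positive HSAT needs no further diagnostic test here.
        return "no further diagnostic test"
    if (result is HsatResult.TECHNICALLY_INADEQUATE
            and repeat_likely_to_succeed and patient_prefers_repeat):
        # Exception noted in the text: both conditions must be present.
        return "repeat HSAT"
    # Negative, inconclusive, or technically inadequate results go to PSG.
    return "PSG"


print(next_diagnostic_step(HsatResult.NEGATIVE))
print(next_diagnostic_step(HsatResult.TECHNICALLY_INADEQUATE,
                           repeat_likely_to_succeed=True,
                           patient_prefers_repeat=True))
```

Requiring both flags mirrors the wording "both of the following are present"; either one alone still leads to PSG in this sketch.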
The TF recognizes that HSAT may have value to patients in some contexts beyond what is covered by these recommendations, but has limited the recommendations to apply to situations where there is sufficient evidence to guide evaluation of benefits versus harms.
PATIENTS' VALUES AND PREFERENCES: Individual patient preference for PSG or HSAT will differ depending on circumstances and values. In one of the four RCTs most generalizable to clinical practice, both HSAT and PSG were performed for each patient, and 76% preferred HSAT.110 This means that a significant percentage (24%) still preferred PSG. Unfortunately, there are insufficient data about diagnostic testing preferences in clinical practice, where preferences may differ from what is seen in the RCT setting. The availability of different options for diagnosis may increase satisfaction, if patient preferences are included in the process of choosing the diagnostic test type. If HSAT is used, the TF determined that patients would value accurate diagnosis, good clinical outcomes, and increased convenience. Based on their clinical judgment, the TF also determined that patients would prefer not having a repeat HSAT if the initial test result is negative, as repeated HSAT would be less likely to produce a definitive result and would unnecessarily inconvenience the patient. In this situation, proceeding directly to PSG, which has greater sensitivity to detect OSA, would be preferred by most patients. The TF also determined that most patients would prefer not to have a repeat HSAT if the initial test was technically inadequate, to avoid inconvenience, but that some patients may desire this option, in specific cases in which there was high likelihood of an adequate result with repeat testing.
SPECIAL CONSIDERATIONS: The following sections describe special considerations when using HSAT for the diagnosis of OSA. They provide additional support for, and explanation of the Remarks, and are based on specifications used by studies that support the recommendation statements.
CLINICAL POPULATION: A review of RCTs that met inclusion criteria indicated that the following criteria should be used to establish the presence of increased risk of moderate to severe OSA and to determine if HSAT use is reasonable: excessive daytime sleepiness occurring on most days, AND the presence of at least two of the following three criteria: habitual loud snoring; witnessed apnea or gasping or choking; or diagnosed hypertension. Among the four RCTs most generalizable to clinical practice, two of the four studies83,84 required ESS > 12 as an entry criterion; one110 required at least two out of three criteria (i.e., sleepiness (ESS > 10), witnessed apnea, snoring) for participation; and one, which was performed in a Veterans Administration population, did not specify entry criteria besides suspected OSA (though the average ESS for participants was elevated at > 12 and 95% were men).83 In the latter study, 9.9% of individuals in the PSG arm were found to have AHI < 5.83 In addition to sleepiness, at least two studies in this subset had specific inclusion criteria such as snoring, witnessed apnea, gasping or choking at night, or hypertension.83,85 One study incorporated neck circumference in the determination of high risk of OSA.84
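The increased-risk rule stated at the start of this paragraph (daytime sleepiness AND at least two of three supporting criteria) can be written as a simple boolean check. The Python sketch below is a minimal illustration; the function and parameter names are hypothetical, and how each criterion is ascertained is left to the clinician.

```python
def increased_risk_of_moderate_to_severe_osa(
    daytime_sleepiness_most_days,
    habitual_loud_snoring,
    witnessed_apnea_gasping_or_choking,
    diagnosed_hypertension,
):
    """Return True when the rule described in the text is satisfied."""
    supporting = sum([
        bool(habitual_loud_snoring),
        bool(witnessed_apnea_gasping_or_choking),
        bool(diagnosed_hypertension),
    ])
    # Sleepiness is mandatory; at least two of the three others must hold.
    return bool(daytime_sleepiness_most_days) and supporting >= 2


# Sleepy patient with loud snoring and hypertension: rule satisfied.
print(increased_risk_of_moderate_to_severe_osa(True, True, False, True))
# Snoring and witnessed apnea but no daytime sleepiness: rule not satisfied.
print(increased_risk_of_moderate_to_severe_osa(False, True, True, False))
```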
EXCLUDED PATIENT POPULATIONS: Three of the four RCTs most generalizable to clinical practice excluded patients with significant cardiopulmonary disease and other significant sleep disorders.83,84,110 Two studies excluded patients taking opioids, patients with an uncontrolled psychiatric disorder or neuromuscular disease, and patients with significant safety-related issues related to driving or work. Other notable exclusion criteria, specified by at least one of the studies, included lack of an appropriate living situation, pregnancy, and alcohol abuse. The single study that did not mention exclusion criteria noted that 3 of 148 individuals in the HSAT arm were diagnosed with CSA and 4 of 148 individuals required supplemental oxygen or bi-level PAP and exited the study.85 In the PSG arm of the study, 6 of 148 individuals were diagnosed with CSA and 12 of 148 required supplemental oxygen or bi-level PAP. Studies outside the four RCTs most generalizable to clinical practice had similar inclusion/exclusion criteria.
Therefore, based on information from three of the four RCTs most generalizable to clinical practice that specified exclusion criteria, and for the reasons discussed above in Resource Use, Benefits and Harms, and Patients' Values and Preferences sections, the TF determined that HSAT should be used in an uncomplicated clinical population. This is defined as the absence of significant cardiopulmonary disease (e.g., heart failure, chronic obstructive pulmonary disease [COPD]), potential respiratory muscle weakness due to neuromuscular conditions, chronic opiate medication use, history of stroke, concern for a significant sleep disorder other than OSA (e.g., CSA, parasomnia, narcolepsy, severe insomnia), and environmental or personal factors that preclude the adequate acquisition and interpretation of data from HSAT.83,84,110
FOLLOW-UP: Based on information from the four RCTs most generalizable to clinical practice,83–85,110 the TF determined that HSAT should be used in the context of an OSA management pathway that incorporates a PAP therapy initiation protocol for APAP or PSG titration, early follow-up after initiation of therapy, and PSG titration studies for patients failing APAP therapy. All RCTs incorporated early follow-up of APAP titration (within 2–7 days after HSAT) by skilled technical staff.83–85,110 As described above, the recommendation for using HSAT to diagnose OSA is based on clinically significant improvements in clinical outcomes. Therefore, the TF determined that HSAT should be used in the context of an OSA management pathway that incorporates a PAP therapy initiation protocol and early follow-up after initiation of therapy.
CLINICAL EXPERTISE: All four RCTs that were most generalizable to clinical practice administered HSAT at academic or tertiary sleep centers with highly skilled sleep medicine providers and technical staff.83–85,110 HSAT recordings were reviewed by a sleep medicine specialist. One RCT that was not included in this subset (because overnight oximetry was used as an entry criterion) used a simplified nurse-led model of care involving nurse specialists experienced in management of sleep disorders (mean of 8.3 years of experience with CPAP therapy). Therefore, the TF determined that HSAT should be administered by an accredited sleep center under the supervision of a board-certified sleep medicine physician, or a physician who has completed a sleep fellowship but is awaiting the next opportunity to take the board examination.
HOME SLEEP APNEA TESTING DEVICE: Among the four RCTs that were most generalizable to clinical practice, three used conventional Type 3 devices (nasal pressure, thoracic and abdominal excursion using RIP technology, oxygen saturation, EKG, body position, and oral thermistor in some cases),84,85,110 and one used a 4-channel device83 based on PAT with three additional channels (heart rate, pulse oximetry, and actigraphy). The TF determined that testing should be performed using these types of HSAT devices that have been demonstrated to be technically adequate. Additional guidance on technical specifications regarding HSAT is provided in The AASM Manual for the Scoring of Sleep and Associated Events.24
RECORDING TIME: In the four RCTs most generalizable to clinical practice, the minimum requirement for an acceptable study was 4 hours of adequate flow and oximetry signals.83–85,110 Whereas one HSAT study83 used PAT as a surrogate of flow, two studies recorded nasal pressure flow85,110 and one study recorded thermistor in addition to nasal pressure flow.84 The latter three studies also recorded thoracic and abdominal movements.84,85,110 All of these studies showed at least equivalence of adherence to PAP therapy and functional improvement in the home versus in-laboratory management pathways.84,85,110 Therefore, the TF determined that a protocol requirement of a minimum of 4 hours of good quality data from HSAT recording, during the habitual sleep period, is warranted to diagnose OSA.
Additionally, nine non-RCT validation studies reported minimum requirements for duration of acceptable signal quality.35,53,54,81,86,88,93,96,116 The required signals and minimum durations included nasal pressure flow and oximetry for at least 3 hours88,93,116 or 4 hours53,81,86,96 and single-channel nasal airflow recording for a minimum of 3 hours35 or only 2 hours.54 The diagnostic accuracy of the cardiorespiratory devices compared against PSG for the detection of OSA at different AHI cutoff points was relatively high. One study reported a sensitivity and specificity of 0.88 and 0.84, respectively, for an HSAT AHI cutoff point of ≥ 9 events/h.53 In a separate study, the sensitivity and specificity for unattended in-home PSG were 0.91 and 0.89 for an AHI cutoff of > 10 events/h, but 0.88 and 0.55, respectively, for an AHI cutoff of > 5 events/h.86 In another study, at an AHI cutoff of > 10 events/h, HSAT had a sensitivity of 0.87 and a specificity of 0.86.88
Overall, the body of evidence investigating the minimum number of hours of adequate data on HSAT required to accurately diagnose OSA is very limited. There are no data to suggest that fewer than 4 hours of technically adequate recording compromises the accuracy of test results, and there is no direct evidence on the impact of a minimum number of recording hours of HSAT on clinical outcomes. Based on available indirect evidence, the TF weighed the “risk” of undergoing less than the required duration of good quality HSAT with resultant false negative (or false positive) results, against the “benefit” of potentially increasing the accuracy by performing PSG. Performing PSG in the scenario of a “positive” diagnosis of OSA is less likely to alter clinical decision-making and may, in fact, result in unnecessary delays in care with increased cost. Conversely, a “negative” HSAT, in the scenario of a high pretest probability of OSA, will justify PSG even when the test is of adequate quality and duration. The TF believes that the goals of establishing an accurate diagnosis, while minimizing patient inconvenience and cost, align with patient preferences.
NIGHTS OF RECORDING TIME: The adequacy of a single night HSAT performed for the diagnosis of OSA in the context of an appropriate clinical population and management pathway is supported by published evidence. Our literature review identified only two studies relevant to the question of whether multiple nights of recording are superior to a single night.35,73 These studies compared the performance of multiple nights (3) of recording with a single-channel HSAT device (i.e., nasal pressure transducer or oximetry) with that of the first night of recording alone. Utilizing PSG as the reference, the studies found that recording over three consecutive nights may decrease the probability of insufficient data and marginally improve accuracy when compared against a single night of recording. However, the TF considered this evidence insufficient to establish the superiority of a multiple-night HSAT protocol over a single-night HSAT protocol, as the studies only included a single channel recording and did not evaluate clinically meaningful outcomes or efficiency of care.
A single HSAT recording encompassing multiple nights may have potential advantages or drawbacks relative to only a single night of recording. For example, if multiple-night HSAT improved accuracy or resulted in fewer inconclusive or inadequate studies, patient outcomes or costs might improve. On the other hand, the potential for multiple-night recordings to increase cost and patient inconvenience must be considered. Insufficient evidence exists to support routine performance of more than a single night's recording for HSAT.
Diagnosis of Obstructive Sleep Apnea in Adults with Comorbid Conditions
Recommendation 4: We recommend that polysomnography, rather than home sleep apnea testing, be used for the diagnosis of OSA in patients with significant cardiorespiratory disease, potential respiratory muscle weakness due to neuromuscular condition, awake hypoventilation or suspicion of sleep-related hypoventilation, chronic opioid medication use, history of stroke or severe insomnia. (STRONG)
Summary
This recommendation is based on the limited data available regarding the validity of HSAT in patients with significant cardiorespiratory disease, neuromuscular disease with respiratory impairment, suspicion of hypoventilation, opioid medication use, history of stroke, or severe insomnia. The overall quality of evidence was very low due to imprecision, indirectness, and risk of bias. The TF considered both the accuracy of HSAT for the detection of OSA, and the concurrent need to detect other forms of sleep-disordered breathing that can occur in these populations (e.g., CSA, hypoventilation and sleep-related hypoxemia). The likelihood of non-obstructive sleep-disordered breathing should be considered by the clinician, when determining which types and severity of cardiorespiratory diseases may be inappropriate for HSAT.
PSG is the gold standard method for the diagnosis of OSA and other forms of sleep-disordered breathing. HSAT has not been adequately validated or demonstrated to provide favorable clinical outcomes and efficient care in the above patient populations, and may result in harm through inaccurate assessment of sleep-disordered breathing.
Based on clinical judgment, the TF determined that the potential harms of using HSAT in the above patient populations outweigh the potential benefits. The TF also determined that patients value accurate OSA diagnosis, favorable clinical outcomes, and the identification of non-obstructive sleep-disordered breathing, and therefore would want to be evaluated by PSG.
Discussion
Four studies examining the validity of HSAT for the diagnosis of OSA in patient populations with significant cardiorespiratory morbidity met our inclusion criteria.117–120 No RCTs were identified that randomized patients with significant co-morbidities, as outlined above, to management pathways using either PSG or HSAT for diagnosis.
PATIENTS WITH COMORBID HEART FAILURE: Our review identified three studies that included patients with heart failure.117,119,120 A study of 50 patients with stable heart failure (Class 2–4; left ventricular ejection fraction < 40%) evaluated the performance of home oximetry against PSG.117 Home oximetry was considered positive if the 2% ODI was ≥ 10, and the PSG was considered positive if AHI was ≥ 15 using a hypopnea criterion that did not require oxygen desaturation or arousal. Home oximetry data were not obtained in 3 patients and oximetry had to be repeated in 2 patients. This study found oximetry to have a sensitivity of 0.85 and specificity of 0.93 in identifying sleep-disordered breathing.117 The specificity was poor for identifying CSA, based on desaturation/resaturation patterns (specificity of 0.17; sensitivity of 1.0) with 10 of 12 patients with OSA identified as having CSA. A study of 50 patients with heart failure (Class 3; LVEF ≤ 35%) evaluating the performance of an HSAT device that included ECG (2 leads), oximetry, and respiratory impedance sensors against PSG was able to obtain valid data in 44 patients in the home setting. Sensitivity, specificity, and accuracy were 0.92, 0.52, and 0.73 at an AHI ≥ 5 cutoff, and 0.67, 0.78, and 0.75 at an AHI ≥ 15 cutoff.119 Unfortunately, the performance of the device in distinguishing central from obstructive events was not evaluated.
A study of 100 patients with stable heart failure (mean LVEF ± SD: 34.6% ± 11) evaluated the performance of a simultaneous 2-channel HSAT device (nasal pressure flow and oximetry) against unattended in-home PSG.120 In the 90 patients with valid HSAT recordings, the sensitivity and specificity were 0.98 and 0.60, respectively, using an AHI ≥ 5 cutoff (hypopneas required 4% oxygen desaturation for both HSAT and PSG), and 0.93 and 0.92, respectively, using an AHI ≥ 15 cutoff. Among these patients, 29% had CSA, 19% had OSA, and 13% had both, based on PSG. The type of sleep apnea could not be determined using the HSAT device. Meta-analysis of these studies (see supplemental material, Table S61) found that in a population of 1,000 patients at high risk of moderate to severe OSA (64% prevalence), 45 to 230 more false negatives and 18 to 79 more false positives would result from the use of HSAT.117,119,120 The quality of evidence was downgraded to low due to imprecision and indirectness.
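To illustrate how per-1,000 false negative and false positive counts arise, the Python sketch below applies the sensitivity and specificity reported for the 2-channel device at the AHI ≥ 15 cutoff (0.93 and 0.92) to the 64% prevalence used in the meta-analysis. It treats PSG as an error-free reference, which is an assumption, and it does not reproduce the pooled analysis in Table S61. The function name `false_results_per_1000` is hypothetical.

```python
def false_results_per_1000(sensitivity, specificity, prevalence, n=1000):
    """Return (false_negatives, false_positives) expected among n patients."""
    with_osa = prevalence * n
    without_osa = n - with_osa
    false_negatives = with_osa * (1.0 - sensitivity)      # missed OSA
    false_positives = without_osa * (1.0 - specificity)   # OSA wrongly called
    return false_negatives, false_positives


# Single-study values quoted above (AHI >= 15 cutoff), 64% prevalence.
fn, fp = false_results_per_1000(0.93, 0.92, prevalence=0.64)
print(f"false negatives: {fn:.1f} per 1,000")
print(f"false positives: {fp:.1f} per 1,000")
```

Under these assumptions the sketch gives about 44.8 false negatives and 28.8 false positives per 1,000 patients tested.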
PATIENTS WITH COMORBID COPD: Only one study addressed the validity of HSAT (nasal pressure, respiratory excursion [piezoelectric sensor], body position, and pulse oximetry) in patients with COPD.118 Of 72 patients with stable COPD (GOLD stage II and III) and symptoms of OSA, only 26 patients (36%) had HSAT studies of reasonable quality.118 When comparing HSAT to PSG, the intraclass correlation coefficient was 0.47 (accuracy not provided).118 Data regarding detection of hypoventilation were not provided. Evidence was downgraded to very low based on imprecision, indirectness, and risk of bias due to significant data loss.
PATIENTS WITH OTHER COMORBIDITIES: No studies were identified that met our inclusion criteria that specifically evaluated the use of HSAT for diagnosis of OSA in patients with history of stroke, chronic opioid medication use, neuromuscular disease with respiratory muscle impairment, high risk of hypoventilation, or severe insomnia. Therefore, the TF concluded that HSAT has not been adequately validated or demonstrated to provide favorable clinical outcomes and efficient care in these patient populations.
OVERALL QUALITY OF EVIDENCE: The evidence for the use of HSAT in diagnosis of OSA among patients with comorbid heart failure was based on three studies, and this evidence was downgraded to low because of imprecision and indirectness.117,119,120 The evidence for the use of HSAT in diagnosis of OSA among patients with COPD was based on a single, small study in which the majority of subjects had technically inadequate HSAT data due to recording failure. There was no direct evidence regarding suitability of HSAT for the diagnosis of OSA in patients with neuromuscular disease with respiratory impairment, hypoventilation, chronic opioid medication use, history of stroke, or severe insomnia. The overall quality of evidence for HSAT in patients with comorbid conditions was downgraded to very low due to imprecision, indirectness, and risk of bias.
BENEFITS VERSUS HARMS: Certain patient populations are at increased risk of having forms of SDB other than OSA (e.g., CSA, hypoventilation, and hypoxemia). These forms of SDB can cause significant morbidity and mortality if left untreated. HSAT has not been validated to diagnose some of these types of SDB (CSA, hypoventilation); therefore, the use of HSAT in populations at increased risk for SDB other than OSA increases the likelihood of not detecting these breathing disorders, which could lead to inadequate treatments, increased long-term healthcare costs, morbidity and mortality. In addition, the accuracy of HSAT has not been validated in patients with severe insomnia, where it may be compromised, leading to similar outcomes. Though the cost of diagnostic PSG is higher than HSAT, the TF determined that the benefits of increased accuracy, use of appropriate therapy, and improved clinical outcomes outweigh this factor. There are, however, instances where PSG cannot be performed for practical reasons (hospitalization, inability of patient to leave home setting or participate in PSG), and use of HSAT may be reasonable, as the alternative is to not address SDB at all.
PATIENTS' VALUES AND PREFERENCES: Based on clinical judgment, the TF determined that patients at increased risk for non-OSA SDB would want these breathing disorders to be adequately diagnosed and treated, as therapy of these disorders can result in significant improvement in health and well-being, and would therefore prefer PSG. Similarly, patients with severe insomnia needing evaluation of OSA would prefer PSG. If the optimal diagnostic test (PSG) was not feasible, then they would desire to have other diagnostic tests (i.e., HSAT) available that may aid their clinical provider in providing care for SDB.
Diagnosis of Obstructive Sleep Apnea in Adults Using a Split-Night versus a Full-Night Polysomnography Protocol
Recommendation 5: We suggest that, if clinically appropriate, a split-night diagnostic protocol, rather than a full-night diagnostic protocol for polysomnography, be used in the diagnosis of OSA. (WEAK)
Remarks: Clinically appropriate is defined as the absence of conditions identified by the clinician that are likely to interfere with successful diagnosis and treatment using a split-night protocol.
This recommendation is based on a split-night protocol that initiates CPAP titration only when the following criteria are met: (1) a moderate to severe degree of OSA is observed during a minimum of 2 hours of recording time on the diagnostic PSG, AND (2) at least 3 hours are available for CPAP titration.
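The two criteria above can be expressed as a simple check. This is a minimal Python sketch under stated assumptions: the function and constant names are hypothetical, and the judgment that a moderate to severe degree of OSA has been observed is passed in as a clinician-supplied boolean, because the guideline does not designate a specific AHI threshold for initiating titration.

```python
MIN_DIAGNOSTIC_HOURS = 2.0  # criterion 1: minimum diagnostic recording time
MIN_TITRATION_HOURS = 3.0   # criterion 2: minimum time available for CPAP titration


def may_initiate_cpap_titration(diagnostic_hours,
                                moderate_to_severe_osa_observed,
                                hours_remaining):
    """Return True only when both split-night criteria are met.

    Hypothetical helper for illustration; it does not replace clinical judgment.
    """
    criterion_1 = (moderate_to_severe_osa_observed
                   and diagnostic_hours >= MIN_DIAGNOSTIC_HOURS)
    criterion_2 = hours_remaining >= MIN_TITRATION_HOURS
    return criterion_1 and criterion_2


# Hypothetical scenarios
print(may_initiate_cpap_titration(2.5, True, 4.0))   # both criteria met
print(may_initiate_cpap_titration(2.5, True, 2.0))   # too little time left to titrate
print(may_initiate_cpap_titration(1.5, True, 5.0))   # diagnostic recording too short
```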
Summary
This recommendation is based on evidence from nine studies that included typical sleep clinic patients studied for symptoms of OSA. The quality of evidence was determined to be low due to imprecision, indirectness, and risk of bias. In the context of an appropriate protocol, a split-night study has acceptable accuracy to diagnose OSA in an uncomplicated adult patient and may improve efficiency of care when performed in the context of adequate clinical and technical expertise. The split-night protocol potentially provides enhanced efficiency of care by diagnosing OSA and establishing PAP treatment needs within a single night recording.
Many studies included in our review were retrospective case series, in which patients deemed clinically inappropriate for split-night study were unlikely to have been included. Therefore, there may be specific patient characteristics, not yet adequately defined in existing literature, that render patients ill-suited to the shorter diagnostic evaluation or titration period of the split-night study. Examples of such characteristics include severe insomnia, claustrophobia, concern for other forms of sleep-disordered breathing, or concern for non-breathing-related sleep disorders.
A split-night study may be preferred relative to full-night PSG and PAP titration studies due to the convenience and cost savings of completing a diagnostic and titration study during one rather than two separate PSG studies. However, this needs to be balanced with the consequences of potentially inconclusive diagnostic or titration portions of the sleep study. If the diagnostic portion is inconclusive, a second PSG is needed. If the titration portion is inconclusive, a second PAP titration study, or the use of autoadjusting PAP may be needed. Based on clinical judgment, the TF determined that the majority of well-informed patients would choose the split-night protocol over a full-night protocol, when clinically appropriate and feasible (Figure 2), and that the benefits of a split-night diagnostic protocol in such circumstances outweigh the potential harms.
Discussion
Our literature search yielded nine studies that met inclusion criteria.112,121–128 Three focused on the diagnostic accuracy of the initial portion of the PSG recording, against the accuracy of using the same full-night recording,121,122,127 and a fourth study compared the accuracy of the diagnostic portion of a split-night study against a separate full-night study.123 Three studies compared success of CPAP titration in those undergoing a split-night study against those undergoing a full-night sleep recording.125,126,128 One study compared CPAP adherence in those who underwent split-night studies against those who had full-night studies.124 A study that did not provide data suitable for inclusion in a meta-analysis examined cost-effectiveness of the split-night study versus the full-night study. Data from this study were considered in the evaluation of resource use.112
DIAGNOSTIC ACCURACY: Four studies that examined diagnostic accuracy and performance characteristics of a split-night protocol used the initial truncated PSG to serve as a representative surrogate of the initial diagnostic portion of a split-night study; the first 2–3 hours of the recording were compared to the full night of sleep recording.121–123,127 One study found that the 2-hour AHI and 3-hour AHI strongly correlated with the full-night AHI (concordance correlation coefficient = 0.93 and 0.97, respectively).121 This study reported a sensitivity of 0.80 (95% CI: 0.67 to 0.90) and specificity of 0.93 (95% CI: 0.83 to 0.98) using a cutoff of AHI ≥ 5, and a sensitivity of 0.77 (95% CI: 0.56 to 0.91) and specificity of 0.98 (95% CI: 0.92 to 1.00) using a cutoff of AHI ≥ 15 (see supplemental material, Table S62 and Table S63). When comparing 3 hours of recording versus the full-night recording, excellent consistency of the AHI was observed; there was no significant difference in the AHI derived from the first 3 hours of total sleep time versus the total sleep time (concordance correlation coefficient adjusted for REM and supine sleep of 0.96 and an accuracy of 93%),121 even in those with a milder degree of OSA (accuracy for AHI cutoffs of ≥ 5, ≥ 10 and ≥ 15 were 95, 97 and 99.5% respectively). One study assessed the diagnostic validity of a 2-hour recording and identified an optimal AHI cutoff of ≥ 30 events/h as providing the highest accuracy (90.9%).122 This study reported a specificity of 0.90 and a sensitivity of 0.92 (see supplemental material, Table S64). 
Another study showed an AHI Pearson correlation coefficient between a full-night study and the diagnostic portion of the split-night study of 0.63 when the split-night study recording time was ≥ 90 minutes.123 Finally, a study that compared sleep and respiratory parameters during the first 3 hours of the night against the values recorded during the entire night did not find a significant difference in AHI.127 Given the lack of definitive data, the TF elected not to designate a specific AHI threshold to inform the decision to initiate PAP titration during a split-night study protocol. The quality of evidence for diagnostic accuracy was downgraded to low due to indirectness, imprecision, and risk of bias.
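The studies above report agreement using two different statistics: a concordance correlation coefficient and a Pearson correlation coefficient. The following minimal Python sketch, using Lin's standard definition of the concordance correlation coefficient and entirely hypothetical AHI values, illustrates why the two are not interchangeable: Pearson correlation measures only linear association, whereas the concordance coefficient also penalizes a systematic offset between the two measurements. The function name `pearson_and_ccc` is an assumption introduced for illustration.

```python
from math import sqrt


def pearson_and_ccc(x, y):
    """Return (Pearson r, Lin's concordance correlation coefficient)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (1/n) variances and covariance, as in Lin's definition
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    pearson = cov_xy / sqrt(var_x * var_y)
    ccc = 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
    return pearson, ccc


# Hypothetical data (events/h): the partial-night AHI is uniformly
# 10 events/h higher than the full-night AHI.
full_night = [5.0, 12.0, 20.0, 35.0, 50.0]
partial_night = [value + 10.0 for value in full_night]

pearson, ccc = pearson_and_ccc(full_night, partial_night)
print(f"Pearson r = {pearson:.3f}, concordance = {ccc:.3f}")
```

In this constructed example the Pearson coefficient is essentially 1 despite the constant offset, while the concordance coefficient is lower, so a high Pearson value alone does not establish that a partial-night AHI reproduces the full-night value.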
CPAP OUTCOMES: Our literature review identified three studies that examined CPAP success in the split-night versus full-night CPAP titration recordings. One study, focused on upper airway resistance syndrome, found no difference in the success rates of CPAP titration, defined as a respiratory effort-related arousal (RERA) index < 5 on the final CPAP setting.125 A cross-over study involving comparisons of split-night CPAP recordings versus full-night CPAP titration recordings in patients with OSA, showed no significant difference of the AHI, arousal index and the percentage sleep time with oxygen saturation below 90% while on CPAP, though the final CPAP pressure was lower at the end of the split-night titration (8.8 versus 10.3 cm H2O).126 One study reported no clinically significant difference in adherence to CPAP treatment in patients undergoing a split-night study (78.7%) versus a full-night study with follow-up titration (77.5%)124 (see supplemental material, Figure S26, Figure S27, and Table S65). A meta-analysis of two studies (performed by the TF) comparing reduction of AHI after CPAP treatment with split-night PSG against full-night PSG found no clinically significant difference. The quality of evidence for CPAP outcomes was downgraded to low, due to imprecision associated with a limited number of studies and small sample size.
RESOURCE USE: A single cost-effectiveness analysis demonstrated that split-night studies were less costly than full-night studies based on cost per quality-adjusted life year (QALY) gained ($1,979 versus $2,092) and would be considered more cost-effective than full-night studies when third-party willingness to pay fell below $11,500 per QALY gained (a level of cost per QALY that would still be considered a good value for payers).112 However, the TF had low confidence in the certainty of resource use, given the lack of high quality evidence to inform cost effectiveness.
OVERALL QUALITY OF EVIDENCE: The available studies were methodologically limited due to a number of issues: use of suboptimal study designs (not RCTs), use of the initial portion of a full-night PSG recording as a surrogate for the baseline portion of a split-night study,121,127 and a lack of consistent use of standard monitoring (e.g., nasal pressure transducer).121 The overall quality of evidence was determined to be low due to a combination of imprecision, indirectness, and the risk of bias.
BENEFITS VERSUS HARMS: The split-night protocol, in comparison to a full-night baseline assessment followed by a separate PAP titration, has the potential to provide the needed diagnostic information and effective CPAP settings within the same recording. Potential disadvantages of the split-night study include insufficient diagnostic sampling (e.g., limited REM sleep time and limited supine time in those with difficulty initiating sleep), and insufficient time to ascertain appropriate CPAP treatment settings. Based on clinical judgment, the TF determined that there is low certainty that the benefits of a split-night study in comparison to full-night studies exceed the harms.
PATIENTS' VALUES AND PREFERENCES: When comparing the split-night study to the full-night study, existing data are consistent and demonstrate a high level of reproducibility of the standard AHI metric and effective identification of the optimal CPAP pressure. These data also suggest that the two approaches lead to similar follow-up CPAP adherence. Based on their clinical judgment, the TF members determined that the majority of well-informed patients would prefer a split-night protocol over a full-night protocol, when clinically appropriate and feasible (Figure 2), due to the lower cost, and the convenience of potentially completing a diagnostic and titration study during one sleep study. However, electing to use a split-night protocol still leaves the possibility that a patient will need to return for a second sleep study, if the diagnostic or titration portions of the split-night study are inconclusive.
Repeat Polysomnography for the Diagnosis of Obstructive Sleep Apnea in Adults
Recommendation 6: We suggest that when the initial polysomnogram is negative and there is still clinical suspicion for OSA, a second polysomnogram be considered for the diagnosis of OSA. (WEAK)
Summary
There was limited evidence from which to assess the efficacy of performing a repeat PSG when the initial PSG is negative. The recommendation is based on evidence from comparisons of a single-night PSG to two nights of PSG for the diagnosis of OSA. These studies found no consistent differences overall in AHI scores, but potentially significant minorities of patients had results that were different in clinically meaningful ways on the two nights. The certainty in the evidence regarding night-to-night variability of AHI from the meta-analysis started as high, but there was limited evidence from which to assess the efficacy of single-night PSG versus two-night PSG in terms of diagnostic accuracy and clinical outcomes. This led to a downgrading of the overall quality of evidence to very low to reflect the low certainty of the TF that a repeat PSG would improve patient outcomes.
Discussion of a repeat PSG with a patient who has a negative initial PSG is warranted to ensure further testing accords with the patient's values and preferences, given the potential benefits and harms associated with additional testing. Proceeding with a second PSG in patients with a negative initial PSG, in order to establish a diagnosis of OSA, must be balanced against the possibility of a false positive diagnosis, inconvenience to the patient, and the added cost of a second study. Based on their clinical judgment, the TF members determined that the majority of well-informed symptomatic patients would choose a second PSG to diagnose suspected OSA when the initial PSG is negative. The TF also determined that the benefits of a second PSG outweigh the harms; however, the certainty that the benefits outweigh the harms is low.
Discussion
Our literature search identified four observational studies that compared AHI scores between two consecutive nights of PSG.34,129–131 There was a wide range of OSA severity within the populations included in the four studies (AHI range: 7–34). None of the studies included data on body position during the 2 nights of PSG. One of two studies that reported on sleep architecture changes130,131 found a statistically significant increase in REM sleep on the second PSG.131 Only one of the studies indicated that PSG scorers were blinded to the other PSG result.131
AHI (NIGHT-TO-NIGHT VARIABILITY): A meta-analysis of four studies compared AHI data between 2 consecutive nights of PSG34,129–131 (see supplemental material, Figure S28 and Table S66) and found the mean difference in the AHI between the 2 nights was 0.14 (95% CI: −1.86 to 2.15), which was not statistically or clinically significant. Nonetheless, a subset of individuals had considerable night-to-night variability in their AHIs, which could have potential clinical implications if the AHI crosses a treatment threshold only during the second PSG. Using an AHI cutoff of ≥ 5 to diagnose OSA, three of the studies34,130,131 identified that 9.9% to 25% of subjects had an AHI < 5 on the first PSG but an AHI ≥ 5 only on the second PSG. Likewise, using an AHI cutoff of ≥ 15 or 20 as a potential treatment threshold, 2 of the studies34,130 observed that 7.6% and 25% of subjects crossed this threshold only on the second study. OSA severity was also noted to vary in a subset of subjects with 26% to 35% changing the severity classification of their OSA (in either direction) on the 2 nights, though the majority were a shift of a single category (e.g., mild to moderate).34,130 The quality of evidence for night-to-night variability was high.
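The two kinds of night-to-night change described above, crossing a diagnostic or treatment threshold only on the second night and shifting severity category, can be made concrete with a short sketch. This is illustrative Python only: the function names are hypothetical, the AHI values are invented, and the category boundaries (5, 15, and 30 events/h) are the conventional severity cutoffs, assumed here rather than restated from this section.

```python
def osa_severity(ahi):
    """Map an AHI (events/h) to a severity category using conventional cutoffs."""
    if ahi < 5:
        return "none"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


def compare_nights(ahi_night1, ahi_night2, threshold=5):
    """Summarize how two PSG nights differ for one patient (hypothetical helper)."""
    severity_1 = osa_severity(ahi_night1)
    severity_2 = osa_severity(ahi_night2)
    return {
        "severity_night1": severity_1,
        "severity_night2": severity_2,
        "severity_changed": severity_1 != severity_2,
        # True when the threshold is reached on the second night only
        "crossed_threshold_on_night2_only": ahi_night1 < threshold <= ahi_night2,
    }


# Hypothetical patient: AHI 3.8 on the first night, 7.2 on the second
result = compare_nights(3.8, 7.2, threshold=5)
print(result)
```

Passing `threshold=15` instead applies the same check to a potential treatment threshold rather than the diagnostic cutoff.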
OVERALL QUALITY OF EVIDENCE: The overall quality of evidence for comparing night-to-night AHI variability was originally considered high, due to precise and consistent data across studies.34,129–131 However, the available literature did not address other clinically meaningful outcomes (e.g., impact on costs, QOL, comorbidities and long-term outcomes) resulting from undergoing a second night of PSG testing. As such, the TF downgraded the overall quality of evidence supporting this recommendation to very low, to reflect the likelihood that future research could result in different estimates of effect for the outcomes of interest, many of which were not available in the current literature.
BENEFITS VERSUS HARMS: A second night of PSG in symptomatic patients allows for the diagnosis of OSA in 8% to 25% of patients with initial false negative studies. Establishing a diagnosis of OSA in these patients allows for treatment that leads to improved symptom control (e.g., less daytime sleepiness), better QOL, and potentially decreased cardiovascular morbidity over time. However, routinely repeating a PSG in patients with an initial negative PSG has potential downsides. There is a risk that repeat testing could lead to false positive cases being identified, and unnecessarily treated. In addition, the routine use of a 2-night study protocol would cause inconvenience to the patient, increased utilization of resources and healthcare costs, and perhaps even delays in the care of other patients awaiting PSG. However, due to the increased likelihood of diagnosing symptomatic patients, and based on their clinical judgment, the TF determined that the benefits of a second PSG outweigh the harms; though the certainty that the benefits outweigh the harms is low.
PATIENTS' VALUES AND PREFERENCES: Patient preference was also considered when weighing the values and trade-offs of a repeat PSG in a patient suspected of having OSA with an initial false negative study. The patient's desire and motivation for further testing can be affected by a variety of factors from the patient's perspective (e.g., QOL, costs) and thus a discussion with the patient is warranted prior to pursuing repeat testing. Based on their clinical judgment, the TF members determined that the majority of well-informed symptomatic patients would choose a second PSG to diagnose suspected OSA, when the initial PSG is negative.
DISCUSSION AND FUTURE DIRECTIONS
This systematic literature review identified many areas that warrant additional study to better inform clinical decision-making and improve patient outcomes.
More accurate and user-friendly clinical screening tools and models are needed to better predict presence and severity of OSA, as well as to improve risk stratification and efficiency of patient management. Identification of biomarkers that detect obstructive sleep-disordered breathing and predict likelihood of adverse clinical outcomes could provide novel information that may improve the diagnosis and management of OSA. These advancements could also improve the efficiency by which conventional sleep apnea tests that measure the physiology of breathing during sleep are used. In addition, these approaches may be useful in situations where conventional tests may not be readily available or logistically feasible to conduct in a timely fashion (e.g., inpatient settings, preoperative clinics).
The current literature is limited, as the majority of study populations included mostly men and had limited ethnic and racial diversity. Therefore, more studies in women and non-Caucasians that elucidate optimal OSA screening methodology, diagnostic approaches and management pathways are needed. These groups may present with different OSA symptoms and have different preferences with regard to, and outcomes in response to, specific OSA diagnostic and management approaches.
For patients scheduled for upper airway surgery for snoring, there is currently insufficient evidence to determine if the diagnostic evaluation of OSA can decrease peri-operative risk and improve surgical outcomes. Because it has been established that questionnaires cannot be used to diagnose OSA, many sleep experts have followed previous guidelines recommending diagnostic testing to evaluate for OSA prior to performing surgery for snoring. Further research to evaluate this protocol would be useful.
While PSG remains the gold standard for the diagnosis of OSA, it involves cumbersome sensors and devices that, if minimized and less obtrusive, could make PSG more tolerable for patients. Newer technology that is less intrusive and more comfortable may influence patient preferences regarding diagnostic approaches. Split-night PSG testing, which may improve the efficiency of PSG, has not been adequately studied. The quality of evidence regarding split-night sleep studies is low and additional research is needed to better determine its overall impact on patient outcomes. Past research often utilized outmoded testing methodology (e.g., they did not use nasal pressure cannulas) or outdated scoring criteria, limiting its relevance. There is also a lack of data on the utility of split-night testing in patients with significant underlying cardiopulmonary disease. Finally, the cost-effectiveness of split-night studies warrants further exploration.
Significant progress has been made in better understanding the accuracy and clinical utility of HSAT, but more is needed. Future research should focus on evaluating HSAT devices in patients with different pretest probabilities for OSA, and in more diverse patient populations, especially those routinely excluded (e.g., at risk for hypoventilation and CSA) from past studies, and in those unable to be studied in the sleep laboratory environment (e.g., due to critical illness, immobility, safety). In addition, the types and numbers of HSAT sensors necessary to adequately diagnose OSA require elucidation. Research should focus on how to better define the optimal physiologic parameters to be measured, particularly concerning the minimal number of parameters necessary and how devices measuring different parameters compare with one another and in different clinical situations. Furthermore, a better understanding of factors associated with inadequate or failed HSAT could help to optimize efficiency of care with regards to choosing the most appropriate diagnostic method for a given patient and clinical situation. Greater study of the cost-effectiveness of home-based management is needed to better define situations in which it may or may not offer value to the healthcare system relative to laboratory-based management. Finally, there is a paucity of data on how patient preferences currently influence clinical decision-making regarding the type of diagnostic testing. The role of patient preference regarding diagnostic pathways (i.e., HSAT versus PSG) and how this may impact outcomes remains to be explored.
More work is needed to determine the duration and number of nights that are optimal for diagnostic testing. For example, when is a second night of PSG indicated in patients suspected of having OSA but who have a negative initial study? Future studies should attempt to determine factors that may predict which patients may benefit from a second night of PSG and measure the impact on clinically meaningful outcomes (e.g., impact on costs, QOL and medical morbidity). Likewise, the duration and number of testing nights required to accurately diagnose or exclude a diagnosis of OSA with HSAT is in need of further study. In terms of the minimal duration of HSAT recording time, future comparative effectiveness studies should consider the impact of HSAT duration on clinical accuracy, clinical efficiency, and functional outcomes. Comparative effectiveness studies should also consider the impact of the number of nights of HSAT on clinically meaningful outcomes and efficiency of care (e.g., time to treatment and costs).
Finally, there is a need for controlled trials to determine the role of repeat testing during chronic clinical management. There was insufficient evidence to determine whether, and under what scenarios, repeat PSG or HSAT to confirm severity of OSA or efficacy of therapy improves outcomes relative to clinical follow-up without retesting.
DISCLOSURE STATEMENT
The development of this clinical practice guideline was funded by the American Academy of Sleep Medicine. Dr. Auckley receives royalties from Up-to-Date and is a consultant for the American Board of Internal Medicine. Dr. Chowdhuri has received research support from the Veteran's Health Administration. Dr. Mehra has received research support from Philips Respironics and Resmed in the form of equipment used in clinical research. Dr. Mehra also received grant support from the NIH/NHLBI and royalties from Up-to-Date. Mr. Harrod is an employee of the American Academy of Sleep Medicine. The other authors have indicated no financial conflicts of interest.
ACKNOWLEDGMENTS
The task force thanks and acknowledges the contributions of Reem Mustafa, MD, MPH, PhD for her work as a methodologist. The task force also thanks and acknowledges the contributions of Lauren Loeding, MPH, for her preliminary work on this guideline.
REFERENCES
Articles from Journal of Clinical Sleep Medicine : JCSM : Official Publication of the American Academy of Sleep Medicine are provided here courtesy of American Academy of Sleep Medicine
Read article at publisher's site: https://doi.org/10.5664/jcsm.6506