Psychological Medicine
© Cambridge University Press 2000

Volume 30(3), May 2000, pp 529-544
The stability of child abuse reports: a longitudinal study of the reporting behaviour of young adults
[Original Articles]

FERGUSSON, D. M.1; HORWOOD, L. J.; WOODWARD, L. J.

From the Christchurch Health and Development Study, Christchurch School of Medicine, Christchurch, New Zealand
1 Address for correspondence: Professor David M. Fergusson, Christchurch Health and Development Study, Christchurch School of Medicine, PO Box 4345, Christchurch, New Zealand.



ABSTRACT

Background. The aims of this study were to use longitudinal report data on physical and sexual abuse to examine the stability and consistency of abuse reports.

Methods. The study was based on the birth cohort of young people studied in the Christchurch Health and Development Study. At ages 18 and 21 years, these young people were questioned about their childhood exposure to physical punishment and sexual abuse. Concurrent with these assessments, sample members were also assessed on measures of psychiatric disorder and suicidal behaviour.

Results. Reports of childhood sexual abuse and physical punishment were relatively unstable and the values of kappa for test-retests of abuse reporting were in the region of 0.45. Inconsistencies in reporting were unrelated to the subject's psychiatric state. Latent class analyses suggested that: (a) those not abused did not falsely report being abused; and (b) those who were abused provided unreliable reports in which the probability of a false negative response was in the region of 50%. Different approaches to classifying subjects as abused led to wide variations in the estimated prevalence of abuse but estimates of the relative risk of psychiatric adjustment problems conditional on abuse exposure remained relatively stable.

Conclusions. There was substantial unreliability in the reporting of child abuse. This unreliability arose because those who were subject to abuse often provided false negative reports. The consequences of errors in reports appear to be: (a) that estimates of abuse prevalence based on a single report are likely to seriously underestimate the true prevalence of abuse; while (b) estimates of the relative risk of psychiatric adjustment problems conditional on abuse appear to be robust to the effects of reporting errors.



INTRODUCTION

There has been a large amount of research conducted into the prevalence, correlates and consequences of child abuse, including both physical and sexual abuse. A central problem in this area has concerned the measurement and assessment of abuse (Widom, 1989, 1997; Briere, 1992; Fergusson & Mullen, 1999). In particular, because of the ethical and practical problems associated with the assessment of child abuse, it proves difficult to conduct prospective research in which the exposure of large and representative samples to child abuse is assessed in a standardized and unbiased way (Widom, 1989; Maughan & Rutter, 1997). In general, there have been two approaches to addressing this problem of measurement. The first has been to use official records or agency samples of children who are known to be abused and to compare these children with a control series (Burgess & Conger, 1978; Trickett & Susman, 1988; Williams, 1994). The second approach has been to use representative adult samples and obtain reports of child abuse on the basis of retrospective recall (Anderson et al. 1993; Elliott & Briere, 1995; Fergusson et al. 1996a; Fleming, 1997). Both assessment approaches have liabilities. First, the use of agency samples or official record data may lead to the selection of samples of abused children that are not fully representative of all cases of child abuse (Ammerman, 1998). The use of such biased samples may, in turn, influence estimates of the prevalence, correlates and consequences of abuse. Secondly, retrospective reports of child abuse may be influenced by errors of reminiscence and recall bias that may, in turn, adversely affect estimates of the prevalence, correlates and consequences of abuse (Briere, 1992; Brewin et al. 1993; Finkelhor, 1994; Williams, 1994; Maughan & Rutter, 1997).

Recently, there has also been a growing literature that has examined sources of fallibility in retrospective reports of child abuse. This research has emphasized the fact that retrospective reports of child abuse are likely to be subject to substantial errors of measurement. The most compelling evidence for this view comes from studies in which adults known to have been physically or sexually abused during childhood have been questioned about their childhood experiences (Herman & Schatzow, 1987; Williams, 1994; Maughan et al. 1995; Johnson et al. 1999). For example, Williams (1994) showed that in a sample of 129 women who were known to have been sexually abused during childhood, a large proportion of these women (38%) did not recall the specific incident of abuse that brought them to official attention. Similarly, Widom and her colleagues have reported a series of studies in which they examine the accuracy with which those known to have been abused during childhood provide accurate accounts of abuse as adults (Widom & Shepard, 1996; Widom & Morris, 1997). In common with the findings reported by Williams (1994), these studies have generally suggested that a substantial proportion of individuals exposed to abuse during childhood fail to report this abuse in adulthood, with these trends being particularly evident for males. More generally, all studies that have examined the accuracy with which adults known to have been exposed to childhood physical or sexual abuse report these experiences suggest considerable under-reporting of abuse experiences (Herman & Schatzow, 1987; Finkelhor, 1994; Williams, 1994; Maughan et al. 1995; Widom & Morris, 1997).

Three general sets of factors may lead to poor reporting of adverse childhood experiences. First, evidence suggests that autobiographical memories of traumatic events may be subject to forgetting and reconstruction in much the same way as memories of non-traumatic events, with increasing length of time between the event and recall being associated with a corresponding decline in recall accuracy and detail. Considerable attention has also been given to the extent to which rates of forgetting may be influenced by contextual factors associated with the abuse, including the age of the child at the time of the abuse, chronicity of the abuse, the victim's relationship to the perpetrator, and the extent to which the abuse involved threat or physical/sexual violence. Although available findings are somewhat mixed, there is a general consensus that abuse at an early age (< 5 years) by a close family member may be more susceptible to processes of forgetting than abuse by a stranger or abuse occurring at an older age (Herman & Schatzow, 1987; Briere & Conte, 1993; Williams, 1994; Elliott & Briere, 1995).

It has also been suggested that retrospective reports of childhood abuse may be influenced by an individual's current mood state or symptomatology, with maladjusted individuals being more likely to report earlier exposure to physical or sexual abuse than their well-functioning peers (Lewinsohn & Rosenbaum, 1987; Widom, 1989). However, there is now growing evidence to suggest that rates of forgetting may be largely unaffected by the psychological state of the respondent (Robins et al. 1985; Brewin et al. 1993; Maughan et al. 1995), and that similar rates of recall tend to be found among well- and poor-functioning respondents (Robins et al. 1985).

Secondly, related to the issue of forgetting, is the increasingly contentious notion that memories of traumatic childhood experiences may be actively repressed by the victim as a means of self-protection (Holmes, 1990; Pope & Hudson, 1995; Penfold, 1996; Memon & Young, 1997; Loftus et al. 1998). However, despite claims that repression may be relatively common among victims of abuse (Harvey & Herman, 1994; Olio & Cornell, 1994), extensive reviews of laboratory and other evidence indicate that insufficient evidence exists to suggest that an individual is capable of actively repressing earlier memories of child abuse (Holmes, 1990; Pope & Hudson, 1995; Loftus et al. 1998).

Thirdly, given the highly sensitive and personal nature of child abuse it is possible that an individual's failure to report these experiences within the context of a research interview or questionnaire may reflect other factors, including embarrassment, a desire to protect the perpetrator, and/or a desire to forget or avoid discussing these experiences (Femina et al. 1990; Melchert & Parker, 1997). For example, it is possible that in addition to those who have forgotten (or repressed) memories of earlier abuse, some respondents may actively choose not to disclose this information in order to avoid being reminded about or having earlier unpleasant experiences intrude into their current life.

Collectively, these findings suggest that although many individuals provide accurate retrospective reports of earlier exposure to physical and sexual abuse, a substantial amount of under-reporting also occurs. Furthermore, an examination of why these inaccuracies in abuse recall are found suggests that retrospective reports of traumatic experiences may potentially be susceptible to multiple sources of fallibility that include: normal processes of forgetting and reconstruction; a respondent's psychological state at the time of reporting; unconscious repression; and/or a conscious reluctance on the part of a respondent to report past painful or embarrassing experiences.

Although there has been substantial attention focused on the reasons for fallibility in retrospective child abuse reports, less attention has been paid to the consequences of this fallibility for estimates based on these reports. While it is clear that the under-reporting of abuse will lead to a mis-estimation of the prevalence of abuse in the population, there is also a need to consider the extent to which under-reporting may influence estimates of associations between measures of child abuse and other factors. The extent to which errors of reporting influence estimates of association is likely to depend critically on the nature of under-reporting. In particular, if under-reporting occurs in a way that is statistically independent of psychiatric outcome or similar measures, the effects of under-reporting on estimates of association may be relatively small. However, if errors of reporting are correlated with the outcomes being studied, the effects of these errors on estimates of association may be more far-reaching. For example, in studies of the association between psychiatric disorder and abuse reports, it is possible that the reporting of abuse is influenced by current psychiatric state so that those with current psychiatric difficulties may be more prone to report and recall child abuse. Under these circumstances, recall bias in the reporting of abuse would lead to an upward bias in the association between child abuse and later disorder. Conversely, it is possible that individuals with a psychiatric disorder may be less prone to report child abuse than individuals without a disorder. This situation could arise if, for example, those exposed to abuse tended to develop disorders in which the repression or forgetting of an abuse experience constituted a symptom of the disorder. Under these circumstances, the association between child abuse and later disorder would be under-estimated. These considerations clearly suggest the need for analyses to consider not only the degree of error present in reports of child abuse, but also the likely consequences of measurement errors for inferences drawn about the prevalence and outcomes of child abuse.

One approach to examining this issue is to employ a test-retest paradigm in which a large and representative sample of respondents is questioned about their earlier child abuse experiences on two or more occasions, with these assessments being spaced sufficiently far apart to ensure that reports of abuse are made independently of each other (Dill et al. 1991; Martin et al. 1993; Fry et al. 1996). This paradigm makes it possible to assess the stability with which child abuse experiences are reported, and also to examine the extent to which stability and instability in reporting over time are related to other characteristics of the individual such as psychiatric state at the time of reporting.

Against this background, this paper reports on a longitudinal study in which a large and representative sample of New Zealand young adults have been questioned about both childhood physical and sexual abuse at the ages of 18 and 21 years using the same set of questions. This research design made it possible to examine the stability with which child abuse was reported by cohort members and to examine the extent to which instability in reports of abuse was related to changing psychiatric state. The study also allowed an examination of the extent to which different methods of measuring child abuse lead to different estimates of abuse prevalence and different estimates of the association between childhood abuse and later psychiatric adjustment. More generally, the aims of this paper were to use test-retest data on reports of childhood physical and sexual abuse to assess the extent of errors in these reports, and the likely consequences of measurement errors for estimates of the prevalence of abuse and the association between abuse and psychiatric adjustment.

METHOD

The data described in this report were gathered during the course of the Christchurch Health and Development Study (CHDS). The CHDS is a longitudinal study of an unselected birth cohort of 1265 children (635 males, 630 females) born in the Christchurch (New Zealand) urban region over a 4-month period during mid-1977. This cohort has been studied at birth, 4 months, 1 year, at annual intervals to age 16 years, and again at ages 18 and 21 years, using information gathered from a combination of sources, including parent interviews, teacher reports, psychometric testing, child interviews, and medical, police and other records. An overview of the study design has been given previously (Fergusson et al. 1989).

Measures

At ages 18 and 21 years, sample members were interviewed using a structured questionnaire that examined a range of mental health issues, including childhood exposure to sexual or physical abuse, symptoms of psychiatric disorders and related problems of adjustment. Interviews typically lasted between 1.5 and 2 h and were administered in private by trained and experienced female interviewers recruited for the project. In all cases, the release of interview data was subject to signed and informed consent from the respondent. The following measures were used in the present analysis.

Childhood sexual abuse

As part of the interview conducted at ages 18 and 21 years, young people were asked whether, before the age of 16, anyone had ever attempted to involve them in any of a series of 15 sexual activities when they did not want this to happen. These activities spanned: (a) non-contact episodes, including indecent exposure, public masturbation by others and unwanted sexual propositions or lewd suggestions; (b) incidents involving sexual contact in the form of sexual fondling, genital contact or attempts to undress the respondent; (c) incidents involving attempted or completed vaginal, oral or anal intercourse (Fergusson et al. 1996b). Young people who reported having experienced any of these behaviours before the age of 16 were then asked, for each perpetrator involved, a further series of questions concerning the nature and extent of abuse, the characteristics of the perpetrator, abuse disclosure and treatment seeking or counselling subsequent to abuse. Information on these issues was gathered using a combination of pre-coded survey items and open-ended questions (Fergusson et al. 1996b).

Examination of the childhood sexual abuse (CSA) report data suggested that these reports varied markedly in the severity and extent of abuse, ranging from single incidents of non-contact abuse to repeated episodes of sexual violation. To enable examination of the consistency of reporting of CSA between the ages of 18 and 21, while at the same time taking account of the variability in the severity and nature of sexual abuse, a range of alternative definitions of CSA were considered. These definitions classified CSA at each age on the basis of a set of increasingly stringent criteria for abuse and were as follows: (1) any sexual abuse (under this definition, young people were classified as experiencing abuse if they reported any form of sexual abuse, including non-contact abuse); (2) contact sexual abuse (under this definition, young people were classified as being abused if they reported sexual abuse involving any form of physical contact with a perpetrator); and, (3) intercourse sexual abuse (under this definition, young people were classified as abused if they reported any incident(s) involving attempted or completed vaginal, oral or anal intercourse).
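The three CSA definitions can be read as nested thresholds on the most severe form of abuse reported. The following is a minimal sketch of that reading, not the study's actual coding procedure; the names CsaSeverity and classify_csa are hypothetical, and the assumption that intercourse abuse always counts as contact abuse is ours.

```python
from enum import IntEnum


class CsaSeverity(IntEnum):
    """Most severe form of abuse reported (hypothetical coding)."""
    NONE = 0
    NON_CONTACT = 1   # e.g. indecent exposure, lewd suggestions
    CONTACT = 2       # e.g. sexual fondling, genital contact
    INTERCOURSE = 3   # attempted or completed intercourse


def classify_csa(most_severe: CsaSeverity) -> dict:
    """Apply the three increasingly stringent definitions.

    Assumption: the definitions are nested, so a report meeting a
    stricter definition also meets every less strict one.
    """
    return {
        "any": most_severe >= CsaSeverity.NON_CONTACT,
        "contact": most_severe >= CsaSeverity.CONTACT,
        "intercourse": most_severe >= CsaSeverity.INTERCOURSE,
    }


example = classify_csa(CsaSeverity.CONTACT)
```

Under this coding, a respondent whose most severe report involved contact abuse is counted under the first two definitions but not the third.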

The prevalence of CSA according to each of these definitions is provided in the Results section. However, in the interests of brevity, the principal analyses in the Results are based on the first criterion, any sexual abuse, and the corresponding analyses for other sexual abuse criteria are summarized in the Supplementary Analyses section.

Childhood physical abuse

The assessment of physical abuse was based on young people's reports of parental use of physical punishment. At ages 18 and 21 years, respondents were asked to report on the extent to which their parents used methods of physical punishment during their childhood years (prior to age 16). Reports were made on a five-point scale: (1) parent never used physical punishment; (2) parent seldom used physical punishment; (3) parent regularly used physical punishment; (4) parent used physical punishment too often or too severely; and, (5) parent used physical punishment in a harsh and abusive way. Separate ratings were obtained for the child's mother figure and father figure wherever possible. Ratings for both parents were then combined into a single rating at each age by classifying the young person's exposure to physical abuse based on the greatest level of exposure to physical punishment from either parent (Fergusson & Lynskey, 1997). To enable an examination of the consistency of reporting of physical abuse between 18 and 21 years across a range of measures of varying abuse severity, two alternative classifications of physical abuse were considered: (1) regular physical punishment (under this definition, the young person was classified as being abused if s/he reported that at least one parent had regularly used physical punishment, or had used a more severe form of physical punishment during childhood); and (2) severe/harsh physical punishment (under this definition, the young person was classified as abused if s/he reported that at least one parent had used physical punishment too often/severely, or had treated the respondent in a harsh/abusive manner).
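The combination rule described above (take the higher of the two parental ratings, then apply a cut-off) can be sketched as follows. This is an illustration only: the function names are hypothetical, and the treatment of a missing parent rating as None is an assumption about how "wherever possible" was handled.

```python
from typing import Optional


def combined_rating(mother: Optional[int], father: Optional[int]) -> Optional[int]:
    """Greatest exposure reported for either parent on the 1-5 scale.

    Assumption: a missing rating (None) is ignored when the other
    parent's rating is available.
    """
    ratings = [r for r in (mother, father) if r is not None]
    return max(ratings) if ratings else None


def classify_punishment(mother: Optional[int], father: Optional[int]) -> dict:
    """Apply the two alternative classifications of physical abuse."""
    rating = combined_rating(mother, father)
    if rating is None:
        return {"regular": None, "severe_harsh": None}
    return {
        "regular": rating >= 3,       # scale points 3, 4 or 5
        "severe_harsh": rating >= 4,  # scale points 4 or 5
    }


example = classify_punishment(mother=2, father=3)
```

In the example, a father rating of 3 places the respondent in the regular physical punishment group but not in the severe/harsh group.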

The prevalence of physical abuse under each definition at each age is given in the Results section. In the interests of brevity, the principal analyses in the Results are based on the measure of regular physical punishment, and the analyses for severe/harsh punishment are summarized in the Supplementary Analyses section.

Psychiatric adjustment (16-21 years)
Psychiatric disorder

At ages 18 and 21, young people were questioned concerning their psychiatric symptomatology between the ages of 16-18 years and 18-21 years respectively, using a questionnaire that combined elements of the Composite International Diagnostic Interview (CIDI: World Health Organization, 1993) and the Self-Report Delinquency Inventory (SRDI: Elliott & Huizinga, 1989). On the basis of these data, DSM-IV (American Psychiatric Association, 1994) symptom criteria were used to classify young people according to a series of psychiatric disorder diagnoses over each assessment period. These disorders included: (a) major depression; (b) anxiety disorders (generalized anxiety disorder, panic/agoraphobia, specific phobia, social phobia); (c) conduct disorder; and (d) alcohol, cannabis or other illicit drug dependence. Items from the CIDI were used to assess depression, anxiety disorders and substance dependence, while items from the SRDI were used to assess the presence of conduct disorder in the sample. A detailed description of these measures has been provided by Fergusson et al. (1996a). To provide an overall assessment of psychiatric disorder, an additional disorder classification based on the presence of any disorder (depression, anxiety, conduct or substance dependence) during the interval was also created.

Suicidal behaviour

Parallel to questioning on psychiatric symptomatology, at each interview young people were also asked a series of questions concerning suicidal behaviour. Specifically, sample members were asked to indicate whether they had thought about taking their life by suicide during the periods from 16-18 years and 18-21 years respectively. Respondents who reported having suicidal thoughts at any point were then asked a further series of questions about: (a) the nature, frequency and reasons for their thoughts; and (b) whether they had made a suicide attempt during the interval, and the nature and outcome of any attempt(s).

In the present analysis, the measures of psychiatric disorder and suicidal behaviour have been used in a number of ways. First, measures of disorder and suicidal behaviour during each assessment period (16-18 years; 18-21 years) were related to variability in the reporting of abuse to examine whether patterns of stability/instability in abuse reporting were correlated with current, past or future psychiatric adjustment. Secondly, a combined estimate of the prevalence of each measure of psychiatric adjustment over the period from 16-21 years was used to examine variability in the relative risk of psychiatric adjustment problems in relation to the way in which abuse was classified and the stringency of the abuse definition adopted. The combined estimate was used to minimize the amount of tabular material presented concerning the relative risk of adjustment problems. However, identical conclusions were drawn from parallel analyses of the separate 16-18 year and 18-21 year measures of adjustment.

Sample size and sample bias

The present analysis is based on the 983 sample members who were interviewed on all measures of child abuse and psychiatric adjustment at ages 18 and 21 years. This sample represented 78% of the original cohort of 1265 children and 88% of the cohort members who were still alive and resident in New Zealand at age 18. Three sample members declined to answer the questions on sexual abuse at age 21, leaving a final sample of 980 for the sexual abuse analyses.

To examine the effects of sample loss on the representativeness of the sample, the 983 cohort members included in this study were compared with those cohort members excluded from the analysis on a series of family social background measures collected at the time of the child's birth. This analysis suggested that losses to follow-up were not associated with maternal age, child ethnicity, gender or birth order in the family. However, there were small but statistically significant (P < 0.01) tendencies for the present sample to under-represent children from families in which mothers lacked educational qualifications, children who entered single parent families and children from lower socio-economic status families. While these results suggest some bias in the present sample towards the under-representation of children from socially disadvantaged backgrounds, previous analyses of data from this cohort in which efforts have been made to correct for the effects of sample selection bias have shown these effects to be negligible (Fergusson & Lloyd, 1991; Fergusson et al. 1997). This would suggest that any selection bias in the sample is unlikely to materially influence the results reported here.

RESULTS
The stability of child abuse reports (18-21 years)

Table 1 shows the relationships between reports of any childhood sexual abuse and regular physical punishment made at ages 18 and 21. It is clear that there was relatively poor agreement between the reports made at 18 and 21 years. This poor agreement may be described in a number of ways, as follows.


Table 1. Frequency distribution of abuse reports at ages 18, 21 years

1 Of those reporting childhood sexual abuse or regular physical punishment at age 18, in the region of 50% subsequently failed to report these events at age 21.

2 Similarly, of those reporting childhood sexual abuse or physical punishment at age 21, in the region of 50% had failed to report these events at age 18.

3 The degree of agreement between reports at ages 18 and 21 is given by the kappa statistic, which shows only a modest level of agreement between reports provided at the two ages. The value of kappa for sexual abuse (κ = 0.45; P < 0.0001) is similar to the value of kappa for physical punishment (κ = 0.47; P < 0.0001), suggesting that the reporting of both outcomes was subject to considerable instability between the two reporting periods.
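For readers unfamiliar with the kappa statistic, the sketch below computes Cohen's kappa for a 2 x 2 test-retest table. The cell counts are hypothetical and are not the Table 1 frequencies; they were chosen only so that about half of those reporting at one age also report at the other.

```python
def cohen_kappa_2x2(both: int, only_first: int, only_second: int, neither: int) -> float:
    """Cohen's kappa for two binary reports made by the same respondents."""
    n = both + only_first + only_second + neither
    p_observed = (both + neither) / n
    p_first = (both + only_first) / n     # proportion reporting at time 1
    p_second = (both + only_second) / n   # proportion reporting at time 2
    # Agreement expected by chance if the two reports were independent.
    p_expected = p_first * p_second + (1 - p_first) * (1 - p_second)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts (NOT the study data).
kappa = cohen_kappa_2x2(both=50, only_first=50, only_second=50, neither=850)
```

With these illustrative counts, raw agreement is 90% but kappa is only about 0.44, because most of the raw agreement comes from the large group that reports no abuse on either occasion.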

Stability of reporting and psychiatric status

The clear instability in abuse reports that is evident in Table 1 raises the issue of the extent to which changes in the reporting of child abuse were related to changes in the individual's psychiatric status. This issue was tested by classifying those who reported any sexual abuse into three groups: (a) those who reported abuse at age 18 but not at 21; (b) those who reported abuse at age 21 but not at age 18; and (c) those who reported abuse at both age 18 and 21. The membership of these sexual abuse reporting groups was then related to measures of psychiatric status observed at ages 18 and 21. Psychiatric outcomes considered included: major depression; anxiety disorder; conduct disorder; alcohol or illicit drug dependence; any form of disorder; suicidal ideation and suicide attempt. A similar analysis was conducted using reports of regular physical punishment. With two abuse measures (any sexual abuse; regular physical punishment) observed at two times (18 and 21) and related to seven outcome measures, there was a total of 28 comparisons of the psychiatric history of those with different abuse reporting histories. Of these 28 comparisons, three proved to be statistically significant. However, application of a Bonferroni-corrected significance level (Grove & Andreasen, 1982) showed that none of these associations reached the required significance level (P < 0.002), taking into account the multiple tests being conducted. These results suggest that variations in the reporting of both sexual and physical abuse were generally unrelated to the individual's psychiatric status at the time of, prior to, or following the reported abuse. In particular, there was little systematic evidence to suggest that individuals who provided unstable reports of abuse in which abuse was reported at one age but not another had a different psychiatric history from those who provided stable reports of abuse. In turn, these findings clearly suggest a process in which the errors of measurement in abuse reports were statistically independent of the individual's psychiatric status and history.
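The arithmetic behind the corrected significance level can be sketched as follows. The family-wise level of 0.05 is an assumption (the text reports only the resulting threshold of P < 0.002), and the function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests


# 2 abuse measures x 2 assessment ages x 7 psychiatric outcomes.
n_comparisons = 2 * 2 * 7

# Assumed family-wise alpha of 0.05; not stated explicitly in the text.
threshold = bonferroni_threshold(0.05, n_comparisons)
```

Dividing 0.05 by the 28 comparisons gives a per-test level of roughly 0.0018, which is consistent with the reported requirement of P < 0.002.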

Latent class analysis

A further issue raised by the results in Table 1 concerns the possibility of combining reports obtained at the two ages to obtain a better measure of abuse than was provided by the separate reports. To achieve this requires the development of a statistical model linking the observed report data to the individual's true but non-observed abuse status. One method of approaching this problem is to fit a latent class model (Goodman, 1974a, b; Clogg, 1995) to the observed data to estimate from the observed responses: (a) the true prevalence of abuse; and (b) the probability that those who were subject to abuse would report this event and the probability that those who were not abused would accurately report that they were not abused. The application of the latent class model to the present data is described in Appendix 1. The final model fitted for both the sexual abuse and physical punishment data assumed: (a) that the prevalence of abuse may vary with gender; (b) that the reports of abuse made at ages 18 and 21 were of similar accuracy; (c) reporting accuracy was similar for males and females; and (d) that false positive reports in which those who were not abused reported abuse did not occur.
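To make assumptions (b) and (d) concrete, the sketch below gives the probabilities of the four possible report patterns for a single group. It is a simplified illustration rather than the Appendix 1 model: it ignores gender, and it assumes that the two reports are independent given true abuse status (the usual local independence assumption of latent class models). The function name is hypothetical.

```python
def report_pattern_probs(prevalence: float, false_negative_rate: float) -> dict:
    """Probability of each (age 18, age 21) report pattern.

    Assumptions: equal reporting accuracy at both ages, no false
    positive reports, and reports conditionally independent given
    true abuse status.
    """
    p = prevalence
    s = 1.0 - false_negative_rate  # probability an abused person reports abuse
    return {
        ("yes", "yes"): p * s * s,
        ("yes", "no"): p * s * (1 - s),
        ("no", "yes"): p * (1 - s) * s,
        # Never abused, or abused but silent on both occasions.
        ("no", "no"): (1 - p) + p * (1 - s) ** 2,
    }


# Illustrative parameter values only.
probs = report_pattern_probs(prevalence=0.30, false_negative_rate=0.50)
```

Because false positives are excluded, anyone who reports abuse on either occasion is treated as truly abused, and all reporting error falls on abused respondents who answer "no".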

For both physical and sexual abuse, these assumptions produced well fitting and parsimonious models of the observed data. The results of the latent class analyses are summarized in Table 2. This Table shows, for both any sexual abuse and regular physical punishment: (a) estimates of the true proportion of the sample who were subject to abuse; (b) estimates of the rate of false negative reports in which those subject to abuse failed to disclose abuse; and (c) the goodness-of-fit of the model based on the log likelihood ratio chi-squared test.


Table 2. Fitted latent class models of the reporting of any sexual abuse and regular physical punishment

The fitted models confirm the impression conveyed by Table 1 that reports of abuse were subject to substantial measurement error, with these errors arising from false negative reports in which those who were subject to abuse failed to report these events. The fitted models suggest that for both any sexual abuse and regular physical punishment, the rate of false negative reports was in the region of 50%, implying that only half of those subject to abuse, in fact, reported abuse at each assessment.
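Under a simplified single-group version of this model (no false positives, equal accuracy at both ages, and reports independent given true status), the prevalence and false negative rate have closed-form estimates from the 2 x 2 table of reports. The sketch below is an illustration of that simplification, not the fitting procedure of Appendix 1; the function name and the counts are hypothetical.

```python
def estimate_latent_class(both: int, only_18: int, only_21: int, neither: int):
    """Closed-form estimates of (prevalence, false negative rate).

    Let s be the probability that an abused person reports abuse and
    p the true prevalence. The model implies
        P(report at both ages)      = p * s**2
        P(report at exactly one)    = 2 * p * s * (1 - s)
    which can be solved for s and p.
    """
    n = both + only_18 + only_21 + neither
    if both == 0:
        raise ValueError("estimates are undefined when nobody reports twice")
    discordant = only_18 + only_21
    s = 2 * both / (2 * both + discordant)
    prevalence = (both / n) / (s * s)
    return prevalence, 1.0 - s


# Hypothetical counts (NOT the study data).
prevalence, false_negative_rate = estimate_latent_class(50, 50, 50, 850)
```

With these illustrative counts, 10% of the sample report abuse at each age, yet the estimated true prevalence is 20% with a false negative rate of 50%.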

The consequences of measurement error for analyses of abuse reports

The preceding analyses set the stage for an exploration of the consequences of errors of measurement in reports of abuse. Bringing the various results together suggests: (1) the reporting of any sexual abuse and regular physical punishment was relatively unstable (Table 1); (2) variations in abuse reporting were unrelated to current, past or future psychiatric status; and (3) the best fitting latent class models suggested a reporting process in which false positive reports did not occur. However, a high rate of false negative responses was found in which individuals subject to sexual abuse or regular physical punishment failed to report these events (Table 2).

Below we explore the implications of these measurement properties for the assessment of the prevalence of sexual abuse/punishment and the associations of sexual abuse/punishment with measures of psychiatric adjustment in adolescence and young adulthood. Table 3 shows the estimates of the prevalence of sexual abuse/punishment for males, females and the total sample derived from different methods of estimation. The methods used included: self-reports of abuse at ages 18 and 21; a composite estimate based on whether sexual abuse/punishment was reported at either age 18 or age 21; and, estimates from the latent class models reported in Table 2.


Table 3. Estimates of abuse prevalence (%) using different assessment criteria

It is evident that there was wide variation in prevalence estimates depending on the method used to estimate prevalence. As a general rule, estimates based on the observed rate of abuse reported at a single time were approximately half to two-thirds of the estimates that took into account data gathered at both time periods and, in all cases, the latent class estimate provided the highest estimate of prevalence. For example, based on data gathered at 18 and 21 years, the reported prevalence of any sexual abuse among females was between 14% and 17%, but the latent class estimate suggested that the true prevalence of sexual abuse was 30% (reflecting the fact that the observed rates were subject to substantial false negative reports).
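The pattern of estimates can be reproduced approximately from the two figures quoted in the text: a true prevalence of 30% among females and a false negative rate in the region of 50%. The sketch below treats the false negative rate as exactly 0.5 and assumes that the two reports are independent given true abuse status; both are simplifying assumptions, and the function name is hypothetical.

```python
def expected_observed_prevalence(true_prevalence: float, false_negative_rate: float) -> dict:
    """Prevalence expected under each assessment method, assuming no
    false positives and conditionally independent reports."""
    sensitivity = 1.0 - false_negative_rate
    return {
        "single_report": true_prevalence * sensitivity,
        # Reported on at least one of two occasions.
        "either_report": true_prevalence * (1.0 - false_negative_rate ** 2),
        "latent_class": true_prevalence,
    }


expected = expected_observed_prevalence(true_prevalence=0.30, false_negative_rate=0.50)
ratio = expected["single_report"] / expected["either_report"]
```

Under these assumptions a single report would identify about 15% of females, in line with the observed 14% to 17%, and the single-report estimate would be about two-thirds of the estimate based on a report at either age.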

Tables 4 and 5 report on the relationships between reports of sexual abuse/punishment and measures of psychiatric adjustment (psychiatric disorder, suicidal behaviour) over the period from 16-21 years using different criteria to classify sexual abuse and physical punishment. The different criteria used follow those employed in Table 3 and include: estimates based on reports at ages 18 and 21; the estimate combining age 18 and 21 reports; and the latent class estimate. In all cases, the association between exposure to sexual abuse/punishment and psychiatric adjustment is assessed by the relative risk statistic and its 95% confidence interval. The relative risk statistic has the interpretation of the increase in the risk of disorder among those exposed to abuse in comparison to those not exposed to abuse under each definition. Table 4 shows the relative risk estimates for any sexual abuse, while Table 5 shows the relative risk estimates for regular physical punishment.


Table 4. Relative risks (95% CIs) of psychiatric symptoms (16-21 years) associated with exposure to CSA under different assessment criteria

Table 5. Relative risks (95% CIs) of psychiatric symptoms (16-21 years) associated with exposure to regular physical punishment under different assessment criteria

In contrast to the results in Table 3, showing that variations in the assessment of abuse led to wide variations in estimates of the prevalence of sexual abuse/punishment, the results in Tables 4 and 5 show that relative risk estimates remained comparatively stable for different approaches to defining abuse. For both sexual abuse and physical punishment, the analyses showed significant relationships between all abuse criteria and all measures of psychiatric adjustment. Furthermore, within the rows of the Tables there is relatively little variation in the relative risk estimates although, in general, relative risk estimates derived from the latent class models tend to be higher than for other assessment criteria. More generally, the results in Tables 3 to 5 show that different approaches to defining abuse lead to wide variations in prevalence estimates but generally quite similar estimates of relative risk.

Supplementary analyses

The above analyses were extended to examine a number of additional issues including: the sensitivity of the results to variations in the stringency of the definition of abuse used; the possibility that factors other than psychiatric status may be associated with variability in the reporting of abuse; and the possibility that the stability of relative risk estimates may vary with gender. The results of these additional analyses are described below.

Variations in the definition of abuse

The results reported above were based on simple dichotomous measures of any sexual abuse and parental use of regular physical punishment. However, as noted in the Method section, it is possible to define both sexual abuse and physical punishment using criteria of varying stringency. To examine the extent to which the results above were sensitive to variations in the stringency of the criteria used to define sexual abuse/punishment, all analyses were replicated using a range of definitions of sexual abuse and physical punishment. For sexual abuse, two definitions were considered that were more stringent than the measure of any childhood sexual abuse used above. These were: contact sexual abuse involving physical contact between the subject and the perpetrator; and, sexual abuse involving attempted or completed vaginal, oral or anal intercourse. For physical punishment, the additional measure of parental use of severe or harsh physical punishment was considered.

Replication of all analyses with these more stringent criteria produced almost identical conclusions to those drawn above. Specifically, these were as follows.

1 There was evidence of substantial instability in abuse reporting. The kappa statistic was 0.47 for contact sexual abuse, 0.38 for sexual abuse involving attempted/completed intercourse, and 0.40 for severe/harsh punishment. Latent class estimates suggested high levels of false negative reporting, with approximately 50-60% of those who were abused (under a given definition) failing to report abuse at either 18 or 21 years.

2 There was no evidence to suggest that variability in abuse reports was systematically associated with, or contaminated by, psychiatric status. Variability in the reporting of abuse at 18 and 21 years was generally unrelated to measures of psychiatric disorder and suicidal behaviour over the periods 16-18 years and 18-21 years.

3 While estimates of the prevalence of abuse varied considerably depending on the assessment criteria used (individual reports at ages 18, 21; a combination of 18, 21 year reports; the latent class estimate), estimates of relative risks of psychiatric disorder/suicidal behaviour remained very stable across the different assessment criteria employed.

Table 6 provides a summary of the prevalence and relative risk estimates obtained from these analyses. The Table shows, for each definition of abuse and each assessment criterion: the estimated prevalence of abuse; and, the median and range of the relative risk statistics for the measures of psychiatric disorder and suicidal behaviour. For comparative purposes, the prevalence and relative risk estimates from the analyses for any sexual abuse and regular physical punishment are also included in the Table. It is evident that for all definitions of abuse, estimates of prevalence based on a combination of reports obtained at 18 and 21 years were higher than those based on a single reporting occasion. In all cases, the latent class estimate of prevalence was the highest, reflecting the high levels of false negative reporting of abuse at each age. Furthermore, despite the substantial variability in prevalence estimates, the results suggest strong stability in estimates of relative risk across different assessment criteria for all definitions of abuse.


Table 6. Summary of prevalence and relative risk estimates obtained under different definitions of abuse severity and different assessment criteria

Other factors associated with variability in abuse reports

The previous analyses suggested that the variability in abuse reports at 18 and 21 years was not associated with young people's psychiatric status at the time of reporting. However, it is possible that other factors may contribute to individual variability in abuse reporting. To examine this possibility, the analysis was extended to examine associations between variability in abuse reports at 18 and 21 years and a range of other measures of the young person's social and family background, individual characteristics and related circumstances. These factors included: measures of family social background (parental age, education, family socio-economic status, welfare dependence); measures of family functioning (parental change, inter-parental conflict, parental attachment, and parental criminality and substance use/abuse); and, individual characteristics and behaviour (gender, cognitive ability, childhood conduct problems and inattention/hyperactivity, novelty seeking, patterns of substance use in adolescence, affiliations with deviant peers). This analysis produced no evidence to suggest that variability in abuse reporting was systematically associated with any of these factors.

Gender variability in relative risk estimates

The results shown in Table 3 suggest some evidence of gender differences in the prevalence of sexual abuse. It could also be suggested that there may be gender differences in the relative risks of psychiatric adjustment problems associated with sexual or physical abuse, and that the analysis of relative risks for the total sample may mask variability in relative risk estimates between males and females. To examine this possibility the analyses in Tables 4 and 5 were extended to test for gender differences in the relative risks of psychiatric disorder or suicidal behaviours. This analysis produced no evidence to suggest that the relative risk estimates differed for males and females, suggesting that conclusions regarding the stability of relative risk estimates under different assessment criteria also applied across both gender groups.

DISCUSSION

In this study we have used data gathered over the course of a longitudinal study to examine the stability of retrospective reports of childhood sexual abuse and physical punishment obtained at the ages of 18 and 21 years. These data have been used to explore a number of issues relating to the nature of reporting errors and their consequences for conclusions based on retrospective self-report data. The major findings and implications of this analysis are described below.

The stability of reports of sexual abuse or physical punishment

A major finding of this study was the relatively poor stability of young people's reports of childhood sexual abuse or parental physical punishment: approximately half of those who reported abuse/punishment at one age failed to report it at the other assessment. The values of the kappa statistics were in the region of 0.45, suggesting relatively poor agreement between reports made at ages 18 and 21 years. An important issue raised by this result is whether this poor agreement across reports was specific to this study, or whether it reflects a more general tendency for retrospective reports of child abuse to be unstable and unreliable. Although there have been relatively few studies of the reliability of abuse reports, available evidence tends to suggest relatively poor reporting reliability. For example, Dill et al. (1991) examined the test-retest reliability of reports of physical and sexual abuse in a sample of 92 female psychiatric patients. The reliability of abuse reports obtained at intake and in response to a structured life experiences interview tended to be low, with kappa values of 0.32 for reports of physical abuse and 0.44 for reports of sexual abuse. Using a somewhat different research design, Bifulco et al. (1997) compared reports made by female siblings of their own and their sisters' exposure to parental neglect, physical abuse, and sexual abuse during childhood. They report kappa statistics of 0.52 and 0.57 for between-sister agreement on reports of physical and sexual abuse respectively. Similarly, other investigations also suggest relatively poor reliability and stability of reports of childhood physical and sexual abuse (Femina et al. 1990; Martin et al. 1993; Fry et al. 1996; Johnson et al. 1999).
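For readers less familiar with the kappa statistic, the following minimal sketch shows how it is computed from a 2×2 table of repeated yes/no reports. The counts are hypothetical: they mimic a sample of 1000 with a true prevalence of 20%, no false positives and a 50% chance of reporting on each occasion. They are not the data in Table 1, and the function names are illustrative.

```python
def cohen_kappa(n11, n12, n21, n22):
    """Cohen's kappa for a 2x2 table of repeated yes/no reports.

    n11: yes at both ages; n12: yes at 18 only;
    n21: yes at 21 only;   n22: no at both ages.
    """
    n = n11 + n12 + n21 + n22
    observed = (n11 + n22) / n                      # observed agreement
    yes_18 = (n11 + n12) / n                        # marginal "yes" rate at 18
    yes_21 = (n11 + n21) / n                        # marginal "yes" rate at 21
    expected = yes_18 * yes_21 + (1 - yes_18) * (1 - yes_21)  # chance agreement
    return (observed - expected) / (1 - expected)


def kappa_implied_by_model(p, a):
    """Kappa implied by true prevalence p and reporting probability a,
    assuming no false positives and independent errors on each occasion."""
    return a * (1 - p) / (1 - p * a)


# Hypothetical counts: 200 exposed, of whom 50 report twice, 100 once, 50 never.
print(cohen_kappa(50, 50, 50, 850))
print(kappa_implied_by_model(0.20, 0.5))
```

With these hypothetical inputs both calculations give a kappa of about 0.44, so a reporting process with no false positives and a 50% false negative rate is sufficient to produce agreement of the order observed in this study.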

These findings suggest that the poor stability of retrospective reports of abuse found in this study reflects a general tendency for abuse reporting to be of low reliability and poor stability. Furthermore, it is of interest to note that the problems associated with reporting instability are not specific to the reporting of sexual abuse, but also apply to the reporting of physical punishment in childhood. This suggests that the origins of report instability are unlikely to lie with factors that are specific to the recall of sexual abuse, but are more likely to reflect processes relating to the reporting of potentially stressful or traumatic childhood experiences in general.

The nature of reporting errors

As we noted earlier, there has been a growing literature on the origins of measurement errors in retrospective reports of childhood circumstances. Broadly speaking, this literature has presented two perspectives on the nature of measurement errors in retrospective reports. The first perspective is that errors in retrospective reports reflect the intrinsic difficulty of recalling and accurately reporting childhood experience over a prolonged period of time. This perspective implies that reports of childhood circumstances are unreliable because of difficulties in recall and reporting. The alternative perspective has emphasized that failure to recall may be influenced by selective processes that may encourage either the inhibition of traumatic memories (repression) or the construction of false or inaccurate memories. This perspective emphasizes that errors of recall are systematic and may reflect an individual's psychological or psychiatric state at, or around, the time of reporting.

In this study we examined this issue by considering the extent to which stability and instability in reports of abuse were related to the presence of psychiatric disorder both prior to, and following the reporting of abuse. This analysis suggested an almost uniform absence of association between the consistency of abuse reporting and measures of psychiatric adjustment. These findings appear to be generally consistent with previous research which suggests that the reporting of adverse childhood experiences is not influenced or coloured by the individual's psychiatric state at the time of reporting (Robins et al. 1985; Brewin et al. 1993; Maughan et al. 1995). A major implication of this conclusion is that although abuse reports may be subject to considerable test unreliability due to difficulties associated with abuse recall, they do not appear to be subject to invalidity or bias arising from the effects of current mental state on the reporting of abuse exposure.

The availability of the repeated measures data gathered in this study made it possible to fit latent class models to the observed report data to examine possible associations between sample members' reports of abusive childhood circumstances and their true but non-observed abuse status. When taken in conjunction with results concerning the lack of association between psychiatric status and response consistency, this analysis suggested that the reporting behaviour in this sample could be described by a relatively simple model that can be summarized by the following propositions.

1 Absence of false positive responses: those who were not subject to abuse did not make false positive responses in which they claimed to have been exposed to abuse.

2 High rates of false negative responses: those who were subject to abuse tended to give highly unreliable reports of their abuse exposure, with approximately 50% of those identified as abused failing to report abuse when questioned on a single occasion.

3 Independence of psychiatric state and reporting errors: although those subject to abuse tended to provide unreliable reports of their abuse history, reporting errors were statistically independent of the respondent's psychiatric status.

In subsequent analyses we then explored the consequences of this model of reporting error for substantive conclusions concerning: the prevalence of abuse; and, the relationship between abuse status and psychiatric state. This was achieved by examining the extent to which conclusions were influenced by different methods of abuse classification. Three approaches to abuse classification were considered including: classification on the basis of a single report; classification on the basis of combined reports; and classification on the basis of latent class analysis. Comparison of results based on these three approaches suggested that errors in the reporting of abuse had the following consequences for study estimates of prevalence and association.

Wide variability in prevalence estimates

It was clear that different approaches to estimating abuse prevalence led to wide variation in prevalence estimates. For example, for the total sample, prevalence estimates for any sexual abuse ranged from as low as 8.5% for the estimate based on reports at age 21 to as high as 18.5% for estimates derived from the latent class model. Similarly, for the total sample, estimates of the prevalence of children exposed to regular physical punishment ranged from as low as 11.3% for the estimate based on the report made at 18 to as high as 22.2% for the latent class estimate. These findings make it abundantly clear that for this sample, errors in reporting were likely to have a substantial impact on estimates of the overall prevalence of abuse. In general, estimates based on reports of abuse at both 18 and 21 years led to prevalence estimates that were approximately twice those based on a single report. This variability reflects the high false negative rates for abuse reports, with the latent class estimates and other data suggesting that approximately 50% of those exposed to abuse do not report these experiences when questioned on a single occasion.

Limited variation in estimates of relative risk

While different approaches to classifying abuse led to very wide variation in estimates of prevalence, there was far less variation in estimates of relative risk derived from different classification approaches. All approaches to classification led to similar overall conclusions about the statistical linkages between childhood sexual abuse or physical punishment and subsequent psychiatric adjustment. These findings suggest that the relative risk estimates were in fact quite robust to the substantial reporting errors that were evident for accounts of childhood sexual abuse and physical punishment. The reason for this appears to be that although reports of child abuse were subject to apparently large reporting errors, these errors were uncorrelated with measures of psychiatric status, resulting in estimates of relative risk that were reasonably robust to reporting errors.
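The reasoning above can be illustrated with a short calculation. The sketch below is an assumption-laden illustration, not the analysis behind Tables 4 and 5: it assumes no false positives and false negatives that are independent of the outcome. The function name and all input values are hypothetical.

```python
def observed_relative_risk(p, a_report, risk_abused, risk_not_abused):
    """Relative risk seen when exposure is classified from error-prone reports.

    p: true prevalence of abuse.
    a_report: probability that an abused person is classified as abused
              (a for one report, 1 - (1 - a)**2 for either of two reports).
    risk_abused, risk_not_abused: true outcome risks in each group.
    Assumes no false positives and reporting errors independent of outcome.
    """
    # Everyone classified as abused is truly abused, so their risk is unbiased.
    risk_reported = risk_abused
    # The "not abused" group mixes true non-cases with unreported cases.
    missed = p * (1 - a_report)
    true_negatives = 1 - p
    risk_unreported = (
        missed * risk_abused + true_negatives * risk_not_abused
    ) / (missed + true_negatives)
    return risk_reported / risk_unreported


# Hypothetical values: true prevalence 10%, true relative risk 0.30 / 0.15 = 2.
rr_single = observed_relative_risk(0.10, 0.5, 0.30, 0.15)
rr_either = observed_relative_risk(0.10, 0.75, 0.30, 0.15)
print(rr_single, rr_either)
```

With these hypothetical values the relative risk falls only from 2.0 to about 1.90 (single report) or 1.95 (either report), whereas the corresponding reported prevalence falls from 10% to 5% or 7.5%. The slight attenuation is in line with the observation that latent class estimates of relative risk tended to be somewhat higher than those from the other criteria.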

Although the present study has focused more on the consequences of reporting errors than the causes of reporting error, the findings above may have some implications for theories of the origins of reporting error. In particular, the finding that the reporting accuracy of physical abuse is similar to that of sexual abuse, coupled with findings suggesting that reporting errors are unrelated to the individual's psychological state at the time of reporting, is not generally consistent with the view that under-reporting of sexual abuse arises from repressed memories. In general, these findings are more consistent with the view that the origins of reporting errors are likely to lie with normal processes of forgetting and recall and perhaps tendencies for respondents to consciously evade answering questions that they find embarrassing or stress inducing. These processes, in combination, could easily result in a situation in which there is a substantial under-reporting of child abuse that is uncorrelated with the individual's psychological state at the time of reporting.

The above conclusions have several important implications for the understanding and interpretation of retrospective reports of abuse obtained on a single occasion. In general, these results suggest that the use of a single assessment may lead to a very substantial under-estimate of the true prevalence of abuse, but that estimates of the relative risk of disorder conditional on abuse exposure may prove to be quite robust to errors in the retrospective reporting of abuse. Nonetheless, the findings also highlight the desirability of examining linkages between child abuse and psychiatric disorder using longitudinal designs in which the classification of abuse is obtained from reports made on two or more occasions. The present results suggest that this approach is likely to lead to substantial improvements in the accuracy with which prevalence is estimated and may also lead to an improved estimation of relative risk or other estimates of the association between child abuse and psychiatric outcomes.

Finally, it is important to recognize some of the liabilities and limitations of the methods used in this analysis. In this paper, we have approached the issue of assessing errors of measurement in abuse reports through an analysis of the consistency of reports of abuse at ages 18 and 21 years, using patterns of consistent and inconsistent responses to construct a model of measurement error. While this approach follows methods typically used in the analysis of multiple measures of the same construct, several limitations of this method need to be acknowledged. In particular, a liability of latent variable methods is that, typically, there is no 'gold standard' against which the latent variable estimate may be validated. In turn, this property implies that the accuracy of reporting is inferred from the consistency of reporting and thus the method does not have the potential to detect errors of measurement arising from individuals who give inaccurate but consistent reports of their abuse history. This absence of a gold standard against which to validate the latent model estimates means that the validity of estimates rests with the realism of the model assumptions about the linkages between the observed reports and the respondent's true but non-observed status. Although it is possible to test the fit of a given model to the observed data, there is no direct method of testing the realism of the model assumptions. It should also be borne in mind that the estimates obtained in this sample describe the reporting errors made by young adults in describing their childhood experiences and that these estimates may not generalize to other populations.

A further caveat that should be borne in mind is that data in this study have been collected by a questionnaire method in which the identification of sexual abuse was based, primarily, on responses to 15 screening items concerning unwanted sexual activity during childhood. It is possible that in part the inconsistencies in reporting may reflect possible liabilities of this method of assessing child abuse and that alternative methodologies may lead to different estimates of stability. More generally, it would be of interest to conduct research to examine the ways in which different approaches to data collection (e.g. face to face interviews, phone interviews, computerized interview) and differing interviewing strategies (e.g. structured questionnaire, clinical interviews) lead to differences in the stability of child abuse reports.

This research was funded by grants from the Health Research Council of New Zealand, the National Child Health Research Foundation, the Canterbury Medical Research Foundation and the New Zealand Lottery Grants Board.

APPENDIX 1: LATENT CLASS MODELS

In this paper we report estimates derived from latent class models of abuse reports at ages 18 and 21 (see Tables 2 to 6). The ways in which these estimates were obtained are described below.

Prevalence estimates and false negative reporting errors

The latent class estimates of prevalence and false negative reporting errors were obtained by fitting a latent class model to the 2×2×2 distribution of gender and abuse reports at ages 18 and 21. Separate models were fitted for each abuse definition. Let X1, X2 represent the observed reports of abuse at ages 18, 21 respectively and suppose these variables are scored 1 if the subject reported abuse at a given time and 2 if the subject did not report abuse. Similarly, let A represent the subject's true but non-observed abuse status, taking values 1 if the subject was abused and 2 if the subject was not abused. The latent class model assumed that the linkages between the observed reports X1, X2 and the subject's true abuse status A were described by the following relationships: EQUATION 1 where Pr (Xi = 1 | A = 1) denotes the probability that a subject who was abused (A = 1) would report abuse at the ith measurement period (i = 1, 2) and Pr (Xi = 1 | A = 2) denotes the probability that a subject who was not abused (A = 2) would report abuse at the ith period (i = 1, 2). The model assumed that the same parameters aij described male and female reports but permitted the rate of abuse to vary with gender. Thus, EQUATION 2 where M, F denote male and female respectively.


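Equation 1

\[ \Pr(X_i = 1 \mid A = 1) = a_{i1}, \qquad \Pr(X_i = 1 \mid A = 2) = a_{i2}, \qquad i = 1, 2 \]

Equation 2

\[ \Pr(A = 1 \mid M) = p_1, \qquad \Pr(A = 1 \mid F) = p_2 \]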

This model can be shown to be exactly identified and estimates of the model parameters aij, p1, p2 can be obtained by maximum likelihood estimation methods. In the present analysis estimates were obtained using PANMARK (Van de Pol et al. 1991). The model parameters have the following interpretation.

1 The parameters a11, a21 represent the true positive rate of abuse reporting at ages 18, 21 respectively. Conversely, since the observed reports are dichotomous, the false negative rates of reporting are given by 1-a11, 1-a21.

2 The parameters a12, a22 represent the false positive rate of abuse reporting at ages 18, 21 respectively. Conversely, the true negative rates of reporting are given by 1-a12, 1-a22.

3 The prevalence of abuse among males is p1; and, p2 is the prevalence of abuse among females.

The results of preliminary model fitting showed that the model above could be simplified by imposing two constraints on model parameters without appreciably altering model fit. These constraints were: (i) the parameters a12, a22 were fixed to zero (this constraint implies that false positive responding did not occur); (ii) the parameters a11, a21 were set equal to each other (when taken in conjunction with the preceding constraint this condition implies that the accuracy of reporting at ages 18 and 21 was the same).
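For a single group, the constrained model has a simple closed-form solution that may help to make the constraints concrete. The following is a minimal sketch, not the PANMARK analysis used in this paper: it ignores gender (whereas the model above allows prevalence to differ for males and females), assumes at least one subject reported abuse on both occasions, and assumes the resulting prevalence estimate does not exceed 1. The function name and the counts are hypothetical.

```python
def constrained_latent_class_estimates(n11, n12, n21, n22):
    """Estimates of a (true-positive rate) and p (true prevalence).

    Constrained model: no false positives (a12 = a22 = 0) and
    equal accuracy at both ages (a11 = a21 = a), single group only.

    n11: reported at both ages; n12: at 18 only;
    n21: at 21 only;            n22: at neither age.
    """
    n = n11 + n12 + n21 + n22
    discordant = n12 + n21
    # Among those who ever report, Pr(both) / Pr(exactly one) = a / (2 * (1 - a)).
    a = 2 * n11 / (2 * n11 + discordant)
    # Pr(at least one report) = p * a * (2 - a); solve for p.
    ever_reported = (n11 + discordant) / n
    p = ever_reported / (a * (2 - a))
    return a, p


# Hypothetical counts for 1000 subjects.
a_hat, p_hat = constrained_latent_class_estimates(50, 50, 50, 850)
print(a_hat, p_hat)
```

With these hypothetical counts, 15% of subjects report abuse at least once, the estimated true-positive rate is 0.5, and the estimated true prevalence is about 20%, illustrating why the latent class estimate exceeds the combined-report estimate.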

The Table below shows, for each definition of sexual and physical abuse, the likelihood ratio chi-squared goodness-of-fit of the constrained model in which a12, a22 were fixed to zero and a11, a21 were set equal.


Table. No caption available.

Latent class estimates of relative risk

The latent class estimates of relative risk shown in Tables 4 to 6 were obtained as follows. For each measure of psychiatric adjustment and each definition of abuse, the observed data comprised the 2×2×2 table of the outcome (case/non-case) and the observed abuse measures at ages 18 and 21. Let X1, X2, A be defined as above and let X3 represent the outcome measure taking values 1 if the outcome was present and 2 if the outcome was absent. The parameters of the general latent class model fitted to the data were: EQUATION 3. This model was also subject to the same constraints as previously: a11 = a21; a12 = a22 = 0. The parameter a31 is the probability of observing the outcome among the abused, and a32 is the probability of observing the outcome among the non-abused. Thus, the latent class estimate of the outcome relative risk is given by the ratio a31/a32 estimated from the fitted model.
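The ratio a31/a32 can be illustrated without a full maximum likelihood fit. The following is a moment-based sketch under the constrained model (a11 = a21 = a; a12 = a22 = 0), not the fitting procedure used in this paper. It assumes that estimates of the prevalence `p` and the true-positive rate `a` are already available and that reporting and outcome are independent given true abuse status. The function name and all input values are hypothetical.

```python
def latent_relative_risk(p, a, risk_ever_reported, risk_never_reported):
    """Moment-based sketch of the ratio a31 / a32 under the constrained model.

    p: true prevalence of abuse; a: true-positive reporting rate.
    risk_ever_reported: outcome rate among those reporting abuse at 18 or 21.
    risk_never_reported: outcome rate among those reporting abuse at neither age.
    """
    # With no false positives, everyone who ever reports was abused.
    a31 = risk_ever_reported
    # Share of never-reporters who were in fact abused.
    hidden = p * (1 - a) ** 2
    w = hidden / (hidden + (1 - p))
    # Never-reporters' risk is a mixture: w * a31 + (1 - w) * a32.
    a32 = (risk_never_reported - w * a31) / (1 - w)
    return a31 / a32


# Hypothetical inputs: p = 10%, a = 0.5, outcome rates of 30% and 15.4%.
rr = latent_relative_risk(0.10, 0.5, 0.30, 0.154)
print(round(rr, 2))
```

With these hypothetical inputs the simple ratio of observed rates is 0.30/0.154, or about 1.95, while the corrected ratio is close to 2, showing how removing unreported cases from the comparison group raises the estimate slightly.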


Equation 3

\[ \Pr(A = 1) = p, \qquad \Pr(X_i = 1 \mid A = j) = a_{ij}, \qquad i = 1, 2, 3; \; j = 1, 2 \]

REFERENCES

American Psychiatric Association (1994). Diagnostic and Statistical Manual of Mental Disorders, 4th edn. APA: Washington, DC.

Ammerman, R. T. (1998). Methodological issues in child maltreatment research. In Handbook of Child Abuse Research and Treatment (ed. J. R. Lutzker), pp. 117-132. Plenum Press: New York.

Anderson, J. C., Martin, J. L., Mullen, P. E., Romans, S. E. & Herbison, P. (1993). The prevalence of childhood sexual abuse experiences in a community sample of women. Journal of the American Academy of Child and Adolescent Psychiatry 32, 911-919.

Bifulco, A., Brown, G. W., Lillie, A. & Jarvis, J. (1997). Memories of childhood neglect and abuse: corroboration in a series of sisters. Journal of Child Psychology and Psychiatry 38, 365-374.

Brewin, C. R., Andrews, B. & Gotlib, I. H. (1993). Psychopathology and early experience: a reappraisal of retrospective reports. Psychological Bulletin 113, 82-89.

Briere, J. (1992). Methodological issues in the study of sexual abuse effects. Journal of Consulting and Clinical Psychology 60, 196-203.

Briere, J. & Conte, J. (1993). Self-reported amnesia for abuse in adults molested as children. Journal of Traumatic Stress 6, 21-31.

Burgess, R. L. & Conger, R. D. (1978). Family interaction in abusive, neglectful, and normal families. Child Development 49, 1163-1173.

Clogg, C. C. (1995). Latent class models. In Handbook of Statistical Modelling for Social and Behavioral Sciences (ed. G. Arminger, C. C. Clogg and M. E. Sobel), pp. 311-359. Plenum Press: New York.

Dill, D. L., Chu, J. A., Grob, M. C. & Eisen, S. V. (1991). The reliability of abuse history reports: a comparison of two inquiry formats. Comprehensive Psychiatry 32, 166-169.

Elliott, D. M. & Briere, J. (1995). Posttraumatic stress associated with delayed recall of sexual abuse: a general population study. Journal of Traumatic Stress 8, 629-647.

Elliott, D. S. & Huizinga, D. (1989). Improving self-reported measures of delinquency. In Cross-National Research in Self-Reported Crime and Delinquency (ed. M. W. Klein), pp. 155-186. Kluwer Academic Publishers: Dordrecht.

Femina, D. D., Yeager, C. A. & Lewis, D. O. (1990). Child abuse: adolescent records v. adult recall. Child Abuse and Neglect 14, 227-231.

Fergusson, D. M. & Lloyd, M. (1991). Smoking during pregnancy and effects on child cognitive ability from the ages of 8 to 12 years. Paediatric and Perinatal Epidemiology 5, 189-200.

Fergusson, D. M. & Lynskey, M. T. (1997). Physical punishment/maltreatment during childhood and adjustment in young adulthood. Child Abuse and Neglect 21, 617-630.

Fergusson, D. M. & Mullen, P. E. (1999). Childhood Sexual Abuse - An Evidence Based Perspective. Sage: London.

Fergusson, D. M., Horwood, L. J., Shannon, F. T. & Lawton, J. M. (1989). The Christchurch Child Development Study: a review of epidemiological findings. Paediatric and Perinatal Epidemiology 3, 278-301.

Fergusson, D. M., Horwood, L. J. & Lynskey, M. T. (1996a). Childhood sexual abuse and psychiatric disorders in young adulthood: Part II: Psychiatric outcomes of sexual abuse. Journal of the American Academy of Child and Adolescent Psychiatry 35, 1365-1374.

Fergusson, D. M., Lynskey, M. T. & Horwood, L. J. (1996b). Childhood sexual abuse and psychiatric disorders in young adulthood: Part I: The prevalence of sexual abuse and the factors associated with sexual abuse. Journal of the American Academy of Child and Adolescent Psychiatry 35, 1355-1364.

Fergusson, D. M., Horwood, L. J. & Lynskey, M. T. (1997). Early dentine lead levels and educational outcomes at 18 years. Journal of Child Psychology and Psychiatry 38, 471-478.

Finkelhor, D. (1994). Answers to important questions about the scope and nature of child sexual abuse. The Future of Children: Sexual Abuse of Children 4, 31-53.

Fleming, J. M. (1997). Prevalence of childhood sexual abuse in community sample of Australian women. Medical Journal of Australia 166, 65-68.

Fry, R. P. W., Rozewicz, L. M. & Crisp, A. H. (1996). Interviewing for sexual abuse: reliability and effect of interviewer gender. Child Abuse and Neglect 20, 725-729.

Goodman, L. A. (1974a). The analysis of systems of qualitative variables when some of the variables are unobservable. Part I. A modified latent structure approach. American Journal of Sociology 79, 1179-1259.

Goodman, L. A. (1974b). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika 61, 215-231.

Grove, W. M. & Andreasen, N. C. (1982). Simultaneous tests of many hypotheses in exploratory research. Journal of Nervous and Mental Disease 170, 3-8.

Harvey, M. R. & Herman, J. L. (1994). Amnesia, partial amnesia and delayed recall among adult survivors of childhood trauma. Consciousness and Cognition 3, 295-306.

Herman, J. L. & Schatzow, E. (1987). Recovery and verification of memories of childhood sexual trauma. Psychoanalytic Psychology 4, 1-14.

Holmes, D. S. (1990). The evidence for repression: an examination of sixty years of research. In Repression and Dissociation: Implications for Personality Theory, Psychopathology and Health (ed. J. L. Singer), pp. 85-102. University of Chicago Press: Chicago.

Johnson, J. G., Cohen, P., Brown, J., Smailes, E. M. & Bernstein, D. P. (1999). Childhood maltreatment increases risk for personality disorders during early adulthood. Archives of General Psychiatry 56, 600-606.

Lewinsohn, P. M. & Rosenbaum, M. (1987). Recall of parental behavior by acute depressives, remitted depressives, and non-depressives. Journal of Personality and Social Psychology 52, 611-619.

Loftus, E., Joslyn, S. & Polage, D. (1998). Repression: a mistaken impression? Development and Psychopathology 10, 781-792.

Martin, J., Anderson, J., Romans, S., Mullen, P. & O'Shea, M. (1993). Asking about child sexual abuse: methodological implications of a two stage survey. Child Abuse and Neglect 17, 383-392.

Maughan, B. & Rutter, M. (1997). Retrospective reporting of childhood adversity: Issues in assessing long-term recall. Journal of Personality Disorders 11, 19-33.

Maughan, B., Pickles, A. & Quinton, D. (1995). Parental hostility, childhood behaviour and adult social functioning. In Coercion and Punishment in Long-Term Perspectives (ed. J. McCord), pp. 34-58. Cambridge University Press: New York.

Melchert, T. P. & Parker, R. L. (1997). Different forms of childhood abuse and memory. Child Abuse and Neglect 21, 125-135.

Memon, A. & Young, M. (1997). Desperately seeking evidence: the recovered memory debate. Legal and Criminological Psychology 2, 131-154.

Olio, K. A. & Cornell, W. F. (1994). Making meaning not monsters: reflections on the delayed memory controversy. Journal of Child Sexual Abuse 3, 77-94.

Penfold, P. S. (1996). The repressed memory controversy: is there middle ground? Canadian Medical Association Journal 155, 647-653.

Pope, H. G. & Hudson, J. I. (1995). Can memories of childhood sexual abuse be repressed? Psychological Medicine 25, 121-126.

Robins, L. N., Schoenberg, S. P., Holmes, S. J., Ratcliff, K. S., Benham, A. & Works, J. (1985). Early home environment and retrospective recall: a test for concordance between siblings with and without psychiatric disorders. American Journal of Orthopsychiatry 55, 27-41.

Trickett, P. K. & Susman, E. J. (1988). Parental perceptions of child-rearing practices in physically abusive and nonabusive families. Developmental Psychology 24, 270-276.

Van de Pol, F., Langeheine, R. & de Jong, W. (1991). Panmark User Manual: Panel Analysis Using Markov Chains Version 2.2. Netherlands Central Bureau of Statistics: Voorburg.

Widom, C. S. (1989). Does violence beget violence? A critical examination of the literature. Psychological Bulletin 106, 3-28.

Widom, C. S. (1997). Child abuse, neglect, and witnessing violence. In Handbook of Antisocial Behavior (ed. D. M. Stoff, J. Breiling and J. D. Maser), pp. 159-170. John Wiley & Sons Inc: New York.

Widom, C. S. & Morris, S. (1997). Accuracy of adult recollections of childhood victimization: Part 2. Childhood sexual abuse. Psychological Assessment 9, 34-46.

Widom, C. S. & Shepard, R. L. (1996). Accuracy of adult recollections of childhood victimization: Part 1. Childhood physical abuse. Psychological Assessment 8, 412-421.

Williams, L. M. (1994). Recall of childhood trauma: a prospective study of women's memories of child sexual abuse. Journal of Consulting and Clinical Psychology 62, 1167-1176.

World Health Organization (1993). Composite International Diagnostic Interview (CIDI). World Health Organization: Geneva.