E.P. Brandon
Archived in ERIC Documentation Service, ED 315 330, 1990
Note: Tables 5, 7, 12, and the Appendix are stored as separate files.
This paper reports data on some aspects of the deductive logical competence of non-graduate teachers in the English-speaking Caribbean, primarily Jamaica. It is intended to complement and extend the work reported in Nolan and Brandon [10] on levels of competence among secondary school children in Jamaica on various types of conditional reasoning.
The teachers sampled were those who sat entrance examinations for the UWI, Mona, Faculty of Education B.Ed. and Cert. Ed. programmes from 1985 to 1990. These people are teachers who have been trained in regional teacher training colleges and are still working in the school system of their territory. At the time of sitting the examination, the majority are working in the primary sector, though many of those applying for training in specific secondary subjects are already working in the secondary sector, and many successful applicants will return to that sector, or even to teacher training itself.
Given that none of the territories can boast a large graduate teaching force, these applicants are not untypical of teachers in their context; in particular, they are fairly representative of the sort of teachers the majority of students are likely to meet. They are perhaps unusual mainly in that they are trying to improve their lot through further teacher training. But while in very general terms the teachers sampled are not unusual, they do not constitute good samples when broken down by such factors as sex. Males dominate in administrative positions and are found mostly in the secondary sector, and it is from such backgrounds that the vast majority of the male applicants come. The data are almost certainly also skewed with respect to geographical distribution, since rural isolation or sheer distance from the Mona campus is probably enough to deter many teachers. In this respect it should also be noted that the representativeness of these applicants is much stronger for Jamaica than for the other territories, whose teachers might in most cases apply to other campuses of the UWI.
Tables 1 and 2 give a breakdown of the six batches of teachers. The same person may sit the entrance examination on more than one occasion, and it is known that several of those applying to study B.Ed. special subjects had already taken the examination when previously applying for the Certificate programme. No attempt has been made to identify such repeaters.
In the course of this report mention will sometimes be made of other related investigations. These include groups of Jamaican school children who were given versions of the tests used in the entrance examination and also school children and a group of teacher trainees in St Lucia who were given a test which was mostly concerned with different principles.
Table 1: Applicants, by programme, sex and territory, 1985-90
| Year | 1985 | | | | 1986 | | | | 1987 | | | |
| N | 622 | | | | 472 | | | | 361 | | | |
| Programme | B.Ed. | | Cert. | | B.Ed. | | Cert. | | B.Ed. | | Cert. | |
| N | 247 | | 375 | | 179 | | 293 | | 224 | | 137 | |
| Sex | M | F | M | F | M | F | M | F | M | F | M | F |
| N | 57 | 188 | 67 | 308 | 46 | 132 | 69 | 224 | 43 | 181 | 32 | 105 |
| Anguilla | - | - | - | - | - | - | - | - | 1 | - | - | - |
| Antigua | 1 | - | 1 | 8 | - | 1 | 1 | 8 | 1 | 2 | 1 | 4 |
| Bahamas | - | 1 | - | - | - | - | - | - | - | - | - | 1 |
| Barbados | - | - | 2 | 9 | - | - | 4 | 10 | - | - | - | 6 |
| Belize | 2 | 2 | 1 | 3 | 3 | 6 | - | 1 | 2 | 3 | 1 | 1 |
| C.O.B. | 6 | 27 | - | - | 3 | 18 | - | - | 3 | 18 | - | - |
| Dominica | - | - | - | - | - | - | 3 | 12 | - | 1 | 2 | 13 |
| Grenada | 1 | 1 | - | - | - | - | - | - | 1 | 2 | - | 1 |
| Guyana | - | - | - | - | - | - | - | - | - | 1 | - | - |
| Jamaica | 38 | 149 | 46 | 270 | 35 | 98 | 30 | 148 | 31 | 152 | 21 | 69 |
| St Kitts | 5 | - | - | 3 | - | - | - | - | - | 1 | - | - |
| St Lucia | 2 | 4 | 7 | 10 | 4 | 2 | 4 | 11 | 2 | 1 | 3 | 5 |
| St Vincent | - | 1 | - | - | - | - | - | - | 1 | - | 1 | - |
| Trinidad | 2 | 3 | 10 | 3 | 1 | 7 | 27 | 34 | 1 | - | 3 | 5 |
| Other | - | - | - | - | - | - | - | - | - | - | - | - |
| Year | 1988 | | | | 1989 | | | | 1990 | | | |
| N | 594 | | | | 622 | | | | 537 | | | |
| Programme | B.Ed. | | Cert. | | B.Ed. | | Cert. | | B.Ed. | | Cert. | |
| N | 389 | | 205 | | 385 | | 237 | | 341 | | 196 | |
| Sex | M | F | M | F | M | F | M | F | M | F | M | F |
| N | 84 | 305 | 32 | 173 | 78 | 307 | 37 | 200 | 81 | 260 | 33 | 163 |
| Anguilla | - | - | - | - | - | - | - | - | - | - | - | - |
| Antigua | - | - | 2 | 17 | - | - | 1 | 34 | - | 3 | 5 | 17 |
| Bahamas | - | - | - | - | - | - | - | - | 1 | 2 | 1 | 1 |
| Barbados | - | - | 1 | 2 | - | 2 | 2 | 7 | - | - | 2 | 14 |
| Belize | 2 | 1 | - | - | 3 | 1 | 1 | 1 | 1 | 1 | - | 1 |
| C.O.B. | 5 | 19 | - | - | 7 | 29 | - | - | 9 | 16 | - | - |
| Dominica | - | 2 | - | 7 | - | 1 | 1 | 5 | - | - | - | - |
| Grenada | - | - | 3 | 4 | 1 | - | 1 | 2 | - | 3 | 3 | 4 |
| Guyana | - | - | - | - | - | - | - | - | - | - | - | - |
| Jamaica | 72 | 276 | 22 | 127 | 63 | 269 | 24 | 139 | 62 | 222 | 13 | 82 |
| St Kitts | 4 | - | - | - | - | - | - | - | 4 | 1 | - | 2 |
| St Lucia | 1 | 3 | 1 | 15 | 4 | 2 | 3 | 9 | 3 | 5 | 2 | 17 |
| St Vincent | - | - | - | - | - | - | - | - | 1 | 1 | 2 | 8 |
| Trinidad | - | 2 | 3 | 1 | - | 3 | 4 | 3 | - | 4 | 5 | 16 |
| Other | - | 2 | - | - | - | - | - | - | - | 2 | - | 1 |
Note: in 1985, two B.Ed. candidates did not give their sex; similarly in 1986, one candidate. C.O.B. stands for the College of the Bahamas; candidates classified as "other" are probably from Turks and Caicos, except in 1990 when the Cert. person was from the Cayman Islands, one B.Ed. from Turks and Caicos and one from Montserrat.
Table 2: Applicants, by specific programme, 1985-90
| | 1985 | 1986 | 1987 | 1988 | 1989 | 1990 |
| B.Ed. total | 247 | 179 | 224 | 389 | 385 | 341 |
| Administration | - | - | 57 | 111 | 96 | 102 |
| Primary | - | - | 58 | 122 | 106 | 95 |
| English | - | - | 23 | 32 | 39 | 20 |
| Int. Science | - | - | 7 | 13 | 20 | 20 |
| Maths | - | - | 16 | 31 | 30 | 20 |
| Reading | - | - | 9 | 7 | 10 | 18 |
| Soc. Studies | - | - | - | 21 | 14 | 18 |
| Spanish | - | - | 1 | 1 | 2 | 1 |
| Special Ed. | - | - | 23 | 27 | 32 | 22 |
| C.O.B. | 33 | 22 | 21 | 24 | 36 | 25 |
| Cert. Ed. total | 375 | 293 | 137 | 205 | 237 | 196 |
| Administration | - | - | 9 | 55 | 18 | 13 |
| English | 53 | 35 | 22 | 27 | 29 | 20 |
| Int. Science | 40 | 33 | 11 | 7 | 22 | 14 |
| Maths | 59 | 62 | 30 | 40 | 46 | 40 |
| Reading | 200 | 134 | 56 | 52 | 83 | 69 |
| Religious Ed. | - | 15 | 4 | 3 | 5 | 2 |
| Soc. Studies | 20 | 11 | 5 | 20 | 33 | 38 |
| Spanish | 3 | 3 | - | 1 | 1 | - |
Note: B.Ed. candidates were not classified by specialization until 1987.
Deductive logic studies those situations, typically but not only in a context of argument, in which a given set of statements (possibly empty), the premises, necessitates another set, the conclusion; that is to say, if the premises were true then the conclusion would have to be true too. An argument in which this happens (or more generally, the inference in any such case) is labelled valid. Very many such situations depend only on the structure of the statements involved and not upon their semantic content, so that deductive logic can be studied as a matter of form: the argument "every A is a B, x is an A, so x is a B" is deductively valid, no matter what one is talking about (humans, mortality and Socrates in the stock example, or equally well whales, mammals, and the whale stranded on your local beach). Just as one can identify valid forms of argument, so one can pick out argument structures that do not fit the criterion (in which the premises could all be true while the conclusion is false), although they might seem to be valid. Such structures are deductively invalid arguments; many of what are called fallacies exemplify such deductively invalid principles of reasoning.
The investigations reported here have been based on the extensive work carried out in the US by Robert Ennis and his associates on competence in simple deductive logic (Ennis and Paulus [5]). In particular their approach to the identification of distinct principles of valid or invalid argument has been retained. The example of valid argument given in the previous paragraph would then count as one (symbolically expressed) case of a particular principle, other cases being given by replacing the symbols with words uniformly through the argument.
In Ennis' tests, and in these investigations, six examples of each such formally identified logical principle are given. Respondents are held to have mastered a principle if they get at least five of these questions right; they are considered on the borderline if they get exactly four of them right. It should be noted that on this approach a person may be considered to have mastered a valid principle, say, even though he has not grasped how it differs from a similar but invalid one. As has been noted elsewhere (Brandon [2]), a stricter view would drastically cut the number of persons who could be regarded as having mastered any logical principle at all.1
Logical principles can be grouped in various ways. The first and crucial criterion is validity or invalidity. The other main way is by reference to their major structural components. Thus one can distinguish various principles focussing on conditional statements (such as all the principles used in Nolan's pioneering work in Jamaica, Nolan [9], and in the report mentioned above, Nolan and Brandon [10]) or quantifiers (words for how many, all, some, most, etc.) or disjunctive statements, and so on.
An interesting and unusual feature of the investigations reported here is that principles involving the plurative or pleonetetic (Geach's terms [6]) quantifier "most" have often been used. This quantifier is not normally studied in elementary formal logic, even though its meaning is perhaps closer to what we often intend in using plurals or the universal quantifier "all" than how that universal quantifier is itself construed in formal logic (cf. Hodges [7], p. 196).
A couple of other "principles" have been used in the investigations, although they do not fit the characterization given above. One such was a set of items depending on general properties of relational expressions, the other on simple mathematical relationships. In both cases, the set of six items was composed of three valid cases and three invalid, so these "principles" have been treated very differently from the rest.
Each entrance examination has contained a paper with a section consisting of 36 items in the standard Ennis format: "suppose you know that .... then would it be true that ....?" Three possible answers are offered: Yes, glossed as "it must be true, given what you are told"; No, "it can't be true, given what you are told"; and Maybe, "it may be true or it may be false, you haven't been told enough to be certain whether it is Yes or No."
In this format, the correct answer for all invalid principles is Maybe. A few, perhaps over-cautious respondents give this answer to virtually all questions, and so automatically tend to achieve mastery of invalid principles. A related quibble with the format is that one's immediate reaction to an invalid principle might be "no", meaning "not necessarily." It is difficult to check whether in fact this has had any serious impact on the results. [But see now my 1992.] The detailed item analyses given in the Appendix suggest that there may be some such effect since 1986 as the numbers giving the answer "No" to questions using invalid principles where the expected incorrect answer is "Yes" are much higher than those answering "Yes" where the expected incorrect answer is "No" (compare q3, q7, q17 with q12 and q20). But on valid principles too there are often large numbers offering the unexpected wrong answer.
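The scoring rule described earlier (at least five of six items right for mastery, exactly four for borderline) and the effect of answering Maybe throughout can be illustrated with a minimal sketch. The function names, the answer encoding, and both answer keys are assumptions made for illustration; they are not taken from the actual tests.

```python
def score_principle(answers, key):
    """Count correct answers on one six-item principle.

    A missing answer (None) never matches the key, so it counts as a mistake.
    """
    return sum(1 for given, right in zip(answers, key) if given == right)


def classify(n_correct):
    """Apply the 5-of-6 mastery and exactly-4 borderline thresholds."""
    if not 0 <= n_correct <= 6:
        raise ValueError("a principle has six items")
    if n_correct >= 5:
        return "mastery"
    if n_correct == 4:
        return "borderline"
    return "no mastery"


# Hypothetical answer keys. For an invalid principle every correct answer is
# "Maybe"; for a valid principle the key below is invented for illustration.
INVALID_KEY = ["Maybe"] * 6
VALID_KEY = ["Yes", "No", "Yes", "Yes", "No", "Yes"]

# An over-cautious respondent who answers "Maybe" to everything.
always_maybe = ["Maybe"] * 6
invalid_result = classify(score_principle(always_maybe, INVALID_KEY))
valid_result = classify(score_principle(always_maybe, VALID_KEY))

# A respondent with one omission and one wrong answer on a valid principle.
partial = classify(
    score_principle(["Yes", "No", "Yes", "Yes", None, "Maybe"], VALID_KEY)
)
```

Under these assumptions the always-Maybe respondent is classified as having mastered the invalid principle while showing no mastery of the valid one, which is the pattern the over-cautious strategy would be expected to produce.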
As noted already, the 36 items are grouped into six principles of six items each. Table 3 lists the principles employed in the six tests and gives the formal structure for each principle.
Table 3: Principles Tested in Each Test
| 1985 [Test B(E)] | MODPON | MODTOL | HYPSYL | DENCON | AFFCON | QAFCON |
| 1986 [Test C(E)] | REL* | MODTOL | MVAL | DENANT | NUM* | MINVAL |
| 1987/89 [Test C(E)2] | MODPON | MODTOL | MVAL | DENANT | NUM* | MINVAL |
| 1990 [Test C(E)3] | MODPON | MODTOL | DISSYL | UMMOST | NONONO | MINVAL |
Notes: * These "principles" consist of three valid and three invalid items each and are not identified in a purely formal manner. The other principles are formally as follows:
- MODPON - [valid] if p then q, p, so q
- MODTOL - [valid] if p then q, not q, so not p
- HYPSYL - [valid] all A are B, all B are C, so all A are C
- DENCON - [valid] all A are B, this is not B, so this is not A
- DENANT - [invalid] if p then q, not p, so not q
- AFFCON - [invalid] if p then q, q, so p
- QAFCON - [invalid] all A are B, this is B, so this is A
- MVAL - [valid] most A are B, all B are C, so most A are C
- MINVAL - [invalid] most A are B, most B are C, so most A are C
- DISSYL - [valid] X is either Y or Z, X is not Y, so X is Z
- UMMOST - [invalid] most A are B, all C are A, so some C are B
- NONONO - [invalid] no A are B, no B are C, so no A are C.
The three valid REL items will be labelled RV; they are based on the transitivity of certain relations. The three invalid REL items will be labelled RI. Similarly the NUM items will be divided into MATHV (or MV) and MATHIN (or MI). As noted later in the text, items in 1985 were also classified into symbolic (SYMBOL or SYM) and suggestive (SUGG) groups. In 1990 some items were classified as suggestive (3 items) and LNEG (7 items) - where the question sentence is grammatically negative.
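The validity criterion can be checked mechanically for the propositional principles listed above: a form is valid when no assignment of truth values makes every premise true and the conclusion false. The sketch below assumes the material (truth-functional) reading of "if p then q" and an inclusive reading of "either ... or"; the names `implies`, `PRINCIPLES`, and `is_valid` are illustrative only.

```python
from itertools import product


def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


# Each entry: (list of premises, conclusion), all as functions of p and q.
# For DISSYL, p stands for "X is Y" and q for "X is Z".
PRINCIPLES = {
    "MODPON": ([lambda p, q: implies(p, q), lambda p, q: p], lambda p, q: q),
    "MODTOL": ([lambda p, q: implies(p, q), lambda p, q: not q], lambda p, q: not p),
    "DENANT": ([lambda p, q: implies(p, q), lambda p, q: not p], lambda p, q: not q),
    "AFFCON": ([lambda p, q: implies(p, q), lambda p, q: q], lambda p, q: p),
    "DISSYL": ([lambda p, q: p or q, lambda p, q: not p], lambda p, q: q),
}


def is_valid(premises, conclusion):
    """Return True if no truth assignment satisfies the premises but not the conclusion."""
    for p, q in product([True, False], repeat=2):
        if all(f(p, q) for f in premises) and not conclusion(p, q):
            return False
    return True


validity = {name: is_valid(*spec) for name, spec in PRINCIPLES.items()}
```

On this reading MODPON, MODTOL, and DISSYL come out valid, while DENANT and AFFCON each have a falsifying assignment (p false and q true), in agreement with the labels given in the list.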
As should be obvious from the preceding, the items used to test a principle have been identified structurally (or semantically for the two deviant principles). No suggestion is involved that any ordinary person would respond to such structure, although one might hope that a student of formal logic would do so. One could say that the items have been chosen normatively, from the perspective of formal logic, rather than with an eye to uncovering the ways in which untutored people handle such problems.2 There is then no particular interest or importance in conventional measures of reliability for such tests; they indicate rather the extent to which people fail to utilize what might seem the most appropriate way of handling these problems. The Appendix contains relevant statistics for these tests, from which it can be seen that the Cronbach alphas are just about respectable, though this is due more to the number of items (and in 1985 to the greater variance of the scale due to the many who left several questions unanswered) than to any uniformity in content, as the item to scale correlations suggest. It can also be seen that such correlations tend to improve as one proceeds through the test, which again is simply a reflection of the numbers failing to finish in good time.
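The remark that the alphas owe more to the number of items than to uniformity of content can be illustrated with the standardized form of Cronbach's alpha, which depends only on the number of items k and the mean inter-item correlation r: alpha = k r / (1 + (k - 1) r). The mean correlation of 0.05 used below is a hypothetical value chosen for illustration, not a figure from the Appendix.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized Cronbach's alpha from item count and mean inter-item correlation."""
    return (n_items * mean_inter_item_r) / (1 + (n_items - 1) * mean_inter_item_r)


MEAN_R = 0.05  # hypothetical mean inter-item correlation

alpha_6 = standardized_alpha(6, MEAN_R)    # one six-item principle
alpha_36 = standardized_alpha(36, MEAN_R)  # the full 36-item section
```

With the same weak average correlation, six items give an alpha of about 0.24 while 36 items give about 0.65, so a moderately respectable alpha need not indicate homogeneous items.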
Each entrance examination paper also contains two other tests, one verbal (as a matter of fact, since 1986 this has involved a grasp of relations between pairs of items so is more than a test of vocabulary), the other usually of spatial ability, but in 1989 and 1990 of simple applied mathematics. (For purposes of comparisons between different years it should be noted that the verbal sub-test was unchanged between 1986 and 1989, and was altered from a 5 choice to a 3 choice format in 1990, but with the same questions. The spatial test was the same from 1986 to 1988; the applied mathematics questions were the same in 1989 and 1990 but in 1989 answers were required while in 1990 a 3 choice format was adopted.) Candidates are told to spend half an hour on each of the three sections, but this does not always happen. The logic questions are in the third section of the paper, with the result that latecomers often omit most or all of them. For that reason, sample numbers are a little smaller than the number of actual entrants since persons failing to answer more than a couple of questions in the logic test have been excluded. Control over examination conditions was worst in 1985 - 47 entrants have been excluded and the item analyses reported in Appendix 1 show that well over 100 candidates failed to answer the last 15 questions; results will occasionally be given for 1985 based on a much smaller group of 452 who were deemed to have completed the test. But for those included in the sample, missing answers have been regarded as mistakes.
The complete entrance examination comprises another two papers for all candidates except those for the College of the Bahamas. Paper 2 is an English language paper; paper 3 was a general education paper but now relates to the specific area the candidate wishes to pursue. These papers vary considerably from year to year; results on paper 2 will be given to allow comparisons with the verbal test included in paper 1.
The most important result from these tests is the percentage mastery of each of the logical principles. Table 4 gives the percentage of those who have mastered and are on the borderline for each principle in each test. The percentage of those lacking mastery is easily calculated.
Table 4: Percentage Mastery and Borderline for Principles
| 1985 (N = 622) | MODPON | MODTOL | HYPSYL | DENCON | AFFCON | QAFCON | (SYM) | (SUG) |
| Mastery | 41 | 27 | 55 | 36 | 2 | 5 | 7 | 9 |
| Borderline | 21 | 18 | 20 | 24 | 4 | 6 | 20 | 29 |
| 1986 (N = 472) | REL* | MODTOL | MVAL | DENANT | NUM* | MINVAL | (RV) | (RI) | (MV) | (MI) |
| Mastery | 30 | 49 | 60 | 5 | 35 | 8 | 45 | 7 | 25 | 35 |
| Borderline | 28 | 20 | 24 | 8 | 31 | 8 | 38 | 42 | 39 | 40 |
| 1987 (N = 361) | MODPON | MODTOL | MVAL | DENANT | NUM* | MINVAL | (MV) | (MI) |
| Mastery | 66 | 38 | 58 | 5 | 33 | 9 | 18 | 38 |
| Borderline | 15 | 20 | 22 | 12 | 24 | 11 | 39 | 37 |
| 1988 (N = 594) | MODPON | MODTOL | MVAL | DENANT | NUM* | MINVAL | (MV) | (MI) |
| Mastery | 67 | 42 | 57 | 4 | 27 | 9 | 17 | 34 |
| Borderline | 17 | 21 | 21 | 9 | 31 | 9 | 42 | 41 |
| 1989 (N = 622) | MODPON | MODTOL | MVAL | DENANT | NUM* | MINVAL | (MV) | (MI) |
| Mastery | 68 | 44 | 57 | 6 | 31 | 13 | 23 | 34 |
| Borderline | 18 | 22 | 21 | 9 | 24 | 9 | 32 | 37 |
| 1990 (N = 537) | MODPON | MODTOL | DISSYL | UMMOST | NONONO | MINVAL |
| Mastery | 72 | 47 | 77 | 8 | 7 | 18 |
| Borderline | 15 | 21 | 9 | 15 | 6 | 12 |
The stability of mastery percentages is noticeable, with 1986 being a little better than the rest for valid principles. This stability might seem upset by the results for 1985 but when the smaller group of 452 "finishers" is used that year falls into line (with, for instance, scores of 53 and 25 for MODPON, and 37 and 23 for MODTOL).
The relative difficulty of the principles also remains stable, and is consistent with Ennis' findings for the US. The valid principles are ordered HYPSYL, MODPON, MVAL, DENCON and MODTOL where principles using quantifiers are easier than their conditional analogues. This is also found with the invalid principles QAFCON and AFFCON while the quantifier principle MINVAL is easier than DENANT though it has no obvious formal analogy with it. Of the principles first tested in 1990, DISSYL proved to be easier than MODPON while UMMOST and, unexpectedly, NONONO proved very difficult.
While the results are fairly stable and in accordance with expectations perhaps the most important point to be made about them is that they do not reflect a satisfactory level of elementary reasoning ability. They tell us that roughly 40% of the entrants have not mastered MODTOL, which is a fundamental reasoning tool in our investigation of the world: a long tradition, stretching at least from Lord Bacon to its most notorious contemporary exponent, Sir Karl Popper, has seen in the falsification of predictions the most powerful means we have of revising our view of the world (cf. Brandon [3], ch. 3). When this is combined with the gross inability (characterizing roughly 80% of the entrants) to recognize invalid inferences (which is to say, a liability to think one is on safe ground when one isn't), one might be forgiven for refusing to place much confidence in the role of reason in the mental economy of these teachers.
While mastery of formally invalid principles is extremely low (as it is in Ennis' US data too) it is worth noting that the respondents do considerably better with invalid items where the invalidity is more a matter of content (those items labelled RI and MI); indeed with the particular items selected here they do better on invalid items with a mathematical content than on valid ones. This may support attempts to excuse the performance on the formal principles by reference to contextual factors (cf. Brandon [1]) or in other ways, though whatever one says, the answers given by the vast majority are still wrong and reflect an inadequacy somewhere in their processing of the information given.
It is possible to inquire whether there are any interesting variations among the entrants on logical competence. But as noted earlier, such investigations can only be regarded as provisional and suggestive, given the nature of the sample. Comments will be made about variations in deductive competence between students applying for different programmes, from different territories, and finally of different gender.
Table 5 gives the mean and standard deviation of scores (with F values) on each principle and of various other variables for three groups of applicants distinguishable in terms of academic programme. Three groups are produced since Certificate applicants can be separated into those applying for the intramural programme at Mona and those applying to take the in-service programme offered on the UWI Distance Teaching Experiment's telephonic system. Anecdotal evidence suggested that these two groups might differ, as indeed Table 5 confirms. It may be added that the UWIDITE applicants include virtually all the non-Jamaican Certificate applicants, together with a good proportion of the Jamaican applicants.
The general picture given by Table 5 is that the B.Ed. mean score lies somewhere between that for the Mona Certificate and the UWIDITE Certificate applicants. In most cases the actual differences are insignificant (though no test has been done on the persistent ranking of the three groups) and the larger differences are mostly localized to non-formal logical matters (RV, RI, and MATHV in 1986; MATHV in 1987 and almost in 1988; and MATHIN in 1989) or other variables (spatial ability in 1985, 1986, and 1988; verbal in 1986, 1987, and 1988 and the related English paper 2 in 1986 and 1988). The low scores of Mona Certificate applicants in these areas could reflect their provenance predominantly in the Jamaican primary sector, whereas the other two programmes recruit a good number from outside Jamaica and/or the non-primary sector.
Only in the 1989 and 1990 tests was a difference significant at the 0.05 level found in any of the purely formal logical principles (MODPON, and in 1990 DISSYL also), and this appears to have helped the overall logic total to show a significant difference too (though with the B.Ed. mean marginally above the UWIDITE in 1989). But given the number of tests reported in Table 5, this does not seem particularly important.
It has been suggested that the small but persistent differences found in the two Certificate groups may be related to the territorial origin of the applicants. Because numbers of non-Jamaicans are always very small the data have not been reported by territory, but the 1987 entrants have been divided a priori (that is to say, on the basis of impressionistic evidence rather than an examination of actual mean scores for the territories) into two groups, one comprising those from Jamaica, the Bahamas, Dominica, and Grenada, the other comprising the rest. This very crude division produced significant differences on the overall logic total (p = 0.03), the principles DENANT (p = 0.03) and MATHV (p = 0.00), and both the verbal test (p = 0.00) and the English paper 2 (p = 0.00). A three-way ANOVA (adding sex as well) on the verbal test (which had shown significant differences by academic programme) gave an F for programme of 15.49 (p = 0.00); for territory of 12.46 (p = 0.00); and for sex of 2.49 (p = 0.12). The biggest difference (on paper 2 where the mainly Jamaican group's mean was 6.16 (s.d. = 5.05) as against 10.54 (s.d. = 7.42) for the others) showed a significant sex/programme interaction (F = 4.23, p = 0.02) but only territory among the main effects, (F = 24.75, p = 0.00). These findings would suggest, if they can be used as a basis for generalization, that while territorial differences are playing a role, there are genuine differences also among the programmes in at least some variables.
While not reported here in detail, territorial means were also calculated for 1988 and 1989; they showed as above larger differences on the non-logical variables: in 1988, the verbal test mean ranged from 21.00 to 38.33; spatial from 15.75 to 30.00; paper 2 from 8.50 to 21.67, while the logic total ranged only from 17.33 to 28.00, valid items from 13.50 to 17.67, and invalid items from 3.83 to 10.33. In 1989 the verbal test ranged from 24.81 to 35.00; maths from 11.67 to 17.14; paper 2 from 32.19 to 47.80; logic total from 18.11 to 24.22, valid items from 13.25 to 17.64, and invalid items from 4.67 to 7.25.
The 1990 entrants were divided on a geographical basis into a northern group (Jamaica, Belize, Bahamas, Turks and Caicos, Cayman, 413 candidates in all) and the rest. While there is a large prima facie difference between these groups on the logic test, when analysed along with sex and programme this factor ceased to give a significant F value, though the main effects continued to be highly significant (see Table 6). What is perhaps most interesting is the comparison with the analyses for the verbal and mathematics tests where both sex and location play a much greater part in accounting for the variance in the results.
Table 6: Three-way ANOVAs of Sub-tests by Location, Sex, and Programme
| Sub-test: | | Logic | | Verbal | | Mathematics | |
| Source of Variation | D.F. | F | prob | F | prob | F | prob |
| Main Effects | 4 | 4.11 | 0.00 | 6.33 | 0.00 | 15.47 | 0.00 |
| Location | 1 | 2.89 | 0.09 | 9.71 | 0.00 | 10.85 | 0.00 |
| Sex | 1 | 1.54 | 0.21 | 10.97 | 0.00 | 44.61 | 0.00 |
| Programme | 2 | 0.80 | 0.45 | 1.15 | 0.32 | 2.25 | 0.11 |
| 2-Factor Interactions | 5 | 1.73 | 0.13 | 1.19 | 0.31 | 0.68 | 0.64 |
| Location/Sex | 1 | 1.21 | 0.27 | 0.58 | 0.45 | 0.41 | 0.53 |
| Location/Programme | 2 | 0.85 | 0.43 | 0.39 | 0.68 | 0.12 | 0.88 |
| Sex/Programme | 2 | 1.36 | 0.27 | 2.52 | 0.08 | 1.60 | 0.20 |
Finally, and possibly most contentiously, it is possible to group respondents with respect to gender. Table 7 reports means, standard deviations, and T values for the various variables grouped by gender (male coded 1, female 2). As noted earlier, there is little likelihood that the gender sampling is as representative as the overall sampling of non-graduate teachers; males are mostly applying for administration, or certain secondary subject specialisms, and this reflects their original position within the school system.
As far as the purely logical principles go, Table 7 reveals virtually no significant differences, except that DENANT goes to the men on three occasions, while the usual female superiority on MODPON and MODTOL is significant in one year. There are, however, several occasions when the male average is significantly higher on the content-based "principles", REL (both RV and RI) and MATHV, on the spatial and mathematical tests, and on the English paper 2. This superiority in English and mathematical tests probably reflects the skewing of the male sample towards higher status parts of the school system.
Grouping together valid and invalid items allows one a simple view of the very great difference in competence which has been found in all such investigations. Table 8 reports means and standard deviations for valid and invalid items by sex, with T values, allowing a fairly strong test of the suggestion made in Nolan and Brandon [10] that there might be a slight gender-related difference here. As can be seen, the evidence of these data does give prima facie support for such a difference. In only one year is there a significant difference on valid items, and the slight advantage varies between male and female from year to year, but in four out of six years there is a significant difference on the invalid items, where the male group consistently scores higher. But what in fact does this prove? Table 4 shows conclusively that neither gender comes anywhere near general mastery of invalid principles; it could be argued that Table 8 simply displays a slightly greater tendency to hedge one's bets (by answering "MAYBE"), which could be a consequence of slightly higher status and the self-confidence it gives.
Table 8: Means and Standard Deviations for Valid/Invalid Items by Sex
| | | Valid | | Invalid | |
| | | Male | Female | Male | Female |
| 1985 | mean | 14.77 | 15.48 | 3.07 | 2.81 |
| | s.d. | 5.21 | 5.18 | 2.48 | 2.30 |
| | T (618) | -1.38 | | 1.11 | |
| | p | 0.17 | | 0.27 | |
| 1986 | mean | 13.69 | 12.70 | 7.37 | 6.39 |
| | s.d. | 2.71 | 2.97 | 3.55 | 2.84 |
| | T (469) | 3.17 | | 3.02 | |
| | p | 0.00 | | 0.00 | |
| 1987 | mean | 14.32 | 14.67 | 6.09 | 5.30 |
| | s.d. | 3.67 | 3.81 | 3.11 | 2.64 |
| | T (359) | -0.71 | | 5.30 | |
| | p | 0.48 | | 0.03 | |
| 1988 | mean | 14.36 | 14.98 | 5.89 | 5.11 |
| | s.d. | 3.97 | 3.47 | 2.97 | 2.76 |
| | T (592) | -1.68 | | 2.68 | |
| | p | 0.09 | | 0.01 | |
| 1989 | mean | 15.17 | 15.05 | 6.08 | 5.27 |
| | s.d. | 4.24 | 3.64 | 3.22 | 2.95 |
| | T (620) | 0.31 | | 2.61 | |
| | p | 0.76 | | 0.01 | |
| 1990 | mean | 14.31 | 14.09 | 6.33 | 5.87 |
| | s.d. | 3.44 | 3.39 | 3.64 | 3.71 |
| | T (535) | 0.60 | | 1.19 | |
| | p | 0.55 | | 0.23 | |
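The T values in Table 8 can be approximately recomputed from the reported means and standard deviations. The sketch below uses a pooled-variance two-sample t statistic, which is an assumption (it is consistent with the reported degrees of freedom but is not stated in the text), and takes the 1988 group sizes from the male and female applicant counts in Table 1, on the further assumption that the analysed sample matches those counts. The function name is illustrative.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# 1988 group sizes, summed over B.Ed. and Cert. applicants in Table 1 (assumed).
N_MALE = 84 + 32
N_FEMALE = 305 + 173

# Male minus female, using the 1988 rows of Table 8.
t_valid, df = pooled_t(14.36, 3.97, N_MALE, 14.98, 3.47, N_FEMALE)
t_invalid, _ = pooled_t(5.89, 2.97, N_MALE, 5.11, 2.76, N_FEMALE)
```

Under these assumptions the degrees of freedom come to 592 and the two statistics land close to the 1988 values reported in the table, with small discrepancies attributable to rounding in the published means and standard deviations.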
Besides inquiring about differences among the candidates, one can also ask about differences among the items in the tests. Ennis had attempted to gauge the importance of various content factors on competence in deductive reasoning. He had distinguished items with symbolic content (schematic letters replacing nouns), items with what he called suggestive content (typically arguments with absurd claims in which the conclusion is patently false), and items with ordinary unexceptionable content. The 1985 logic test retained his types of item (with the rest classified as ordinary); it revealed, as expected from Ennis' own work, that these kinds of content do not appear to make much difference overall, although symbolic items do seem somewhat more difficult (see Table 9) and are more often omitted (see the 1985 data in the Appendix in particular). Anecdotal evidence from teaching formal logic suggests that the kind of symbolism also makes a difference. In the entrance examination logic tests, nouns are replaced by schematic letters (so one might be told that there is an X) but in another investigation one of the items for the principle MODPON involved the replacement of whole sentences (or clauses) with schematic letters (e.g. if p then q) as is done in formal logic. As expected, this item proved much more difficult than the noun replacement equivalent, as shown in Table 10.
Table 9: Percentage Mastery and Means on Kinds of Content (1985, N = 452)
| | SYMBOLIC | SUGGESTIVE | ORDINARY |
| % Mastery | 9 | 12 | 15 |
| % Borderline | 24 | 38 | 38 |
| Mean | 2.93 | 3.36 | 3.62 |
| Standard deviation | 1.26 | 1.17 | 0.83 |
Table 10: Difficulty and Discrimination Indices for Symbolic Items in Two Samples
| | St Lucia Trainees | | Jamaica Grade 10 | |
| | Diff. | Disc. | Diff. | Disc. |
| Noun variable | 0.91 | 0.13 | 0.87 | 0.41 |
| Sentence variable | 0.53 | 0.13 | 0.48 | 0.35 |
Note: The argument principle is MODPON, in one case phrased as "if there is an X, there is a Y", etc., in the other as "if p then q", etc. The difficulty index is, as usual, really a facility index: 91% of the trainees got the noun variable question right. Unpublished data.
While in general types of content do not play a very important part, the specific content of individual items may well make a big difference to performance. This possibility is consistent with the suggestion that people in general do not attend to, or at least directly base their judgment upon an informed awareness of, the formal structure of an argument or inference when they evaluate it. In any case, the difficulty (or better facility) and discrimination indices for the individual items reveal considerable variation within each formally identical principle, as can be seen from the item analyses in the Appendix. The difficulty index is the mean right score for the item; the discrimination index used is the difficulty in the bottom 27% subtracted from the difficulty in the top 27% of the sample. In most years the most consistent set of items is MINVAL, where there are no distractions provided by logically superfluous and unexpected negations in any of the items.
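The two indices just defined can be sketched as follows. The response matrix is invented for illustration (1 for a right answer, 0 otherwise), and ranking respondents by their total score and rounding 27% of the sample to the nearest whole number are assumptions about details the text does not spell out.

```python
def item_indices(responses, fraction=0.27):
    """Return (difficulty, discrimination) lists, one value per item.

    difficulty: proportion of all respondents answering the item correctly.
    discrimination: that proportion in the top group minus the bottom group,
    where groups are the highest and lowest scorers on the whole test.
    """
    n = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    ranked = sorted(range(n), key=lambda i: totals[i], reverse=True)
    group_size = max(1, round(fraction * n))
    top, bottom = ranked[:group_size], ranked[-group_size:]

    def facility(rows, item):
        return sum(responses[i][item] for i in rows) / len(rows)

    difficulty = [facility(range(n), j) for j in range(n_items)]
    discrimination = [facility(top, j) - facility(bottom, j) for j in range(n_items)]
    return difficulty, discrimination


# Hypothetical data: ten respondents, six items.
RESPONSES = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

difficulty, discrimination = item_indices(RESPONSES)
```

In this invented data the first item is easy (facility 0.8) and the fourth separates the top and bottom groups completely (discrimination 1.0), which shows how two items can differ sharply on both indices even within one set.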
The 1985 test was structured to examine the relationship between items expressed using conditionals and virtually identical items expressed with quantifiers. In modern formal logic the quantified expressions are in fact seen as containing conditionals, so "all A are B" is construed as "for all x, if x is A then x is B". The questions testing MODTOL were paired with those for DENCON while three of those for AFFCON were paired with QAFCON. Table 11 gives mean scores and inter-item correlations on these sets of items. While the inter-item correlations are not particularly impressive, they are higher than such correlations between unpaired items, even within the same principle: the virtual identity of content seems to be playing a role, whereas the structural parallels exploited by the theory of modern logic do not seem to impinge.
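The conditional construal of "all A are B" makes it possible to test the quantified principles by searching small finite models for counterexamples. The sketch below is an illustration under stated assumptions: "most A are B" is read as "there is at least one A, and more than half of the As are Bs", "this" is modelled as one designated individual, and only universes of up to three individuals are searched, so the absence of a counterexample is evidence of validity rather than a proof. All names are hypothetical.

```python
from itertools import product


def all_are(X, Y):
    """For all x, if x is in X then x is in Y (the subset relation)."""
    return X <= Y


def no_are(X, Y):
    return not (X & Y)


def some_are(X, Y):
    return bool(X & Y)


def most_are(X, Y):
    """Assumed reading: X is non-empty and more than half of X is in Y."""
    return len(X) > 0 and 2 * len(X & Y) > len(X)


THIS = 0  # the designated individual referred to as "this"

# Each principle maps sets A, B, C to (premises hold, conclusion holds).
PRINCIPLES = {
    "HYPSYL": lambda A, B, C: (all_are(A, B) and all_are(B, C), all_are(A, C)),
    "DENCON": lambda A, B, C: (all_are(A, B) and THIS not in B, THIS not in A),
    "QAFCON": lambda A, B, C: (all_are(A, B) and THIS in B, THIS in A),
    "MVAL": lambda A, B, C: (most_are(A, B) and all_are(B, C), most_are(A, C)),
    "MINVAL": lambda A, B, C: (most_are(A, B) and most_are(B, C), most_are(A, C)),
    "UMMOST": lambda A, B, C: (most_are(A, B) and all_are(C, A), some_are(C, B)),
    "NONONO": lambda A, B, C: (no_are(A, B) and no_are(B, C), no_are(A, C)),
}


def find_counterexample(principle, max_size=3):
    """Return sets (A, B, C) with true premises and a false conclusion, or None."""
    for size in range(1, max_size + 1):
        for membership in product(product([False, True], repeat=3), repeat=size):
            A = {i for i, m in enumerate(membership) if m[0]}
            B = {i for i, m in enumerate(membership) if m[1]}
            C = {i for i, m in enumerate(membership) if m[2]}
            premises, conclusion = principle(A, B, C)
            if premises and not conclusion:
                return A, B, C
    return None


no_counterexample = {
    name: find_counterexample(principle) is None
    for name, principle in PRINCIPLES.items()
}
```

The search finds no counterexample for HYPSYL, DENCON, or MVAL and finds one for each of QAFCON, MINVAL, UMMOST, and NONONO, matching the valid and invalid labels used in this paper. DENCON and QAFCON behave exactly like their conditional analogues MODTOL and AFFCON, which is the structural parallel that the paired items were designed to probe.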
Table 11: Mean scores and inter-item correlations on paired items
| MODTOL | DENCON | corr. | AFFCON | QAFCON | corr. | ||||
| item | |||||||||