Internet Speech Perception Labs
(ISPL)

Developed with PHP, MySQL, and Java WS
Joint Research of The University of Tokyo Marine Science & Technology and The City College of New York

 

Project Objectives

The goal of this project is to build Internet Speech Perception Labs (ISPL), an experimental academic tool deployed over the Internet to train the communication skills of people who speak English as a second language. It is collaborative research with Prof. Takagi and Prof. Uchida at The University of Tokyo Marine Science and Technology.

ISPL works like a broker application: the system allows the administrative user (exam provider) to upload speech files and exam formats to generate a set of exams and to collect test takers' responses via the Internet. The test taker (student) can start ISPL from a conventional Web browser, take exams, and receive scores and result statistics immediately after completing them. In a typical speech perception exam session, the system presents a set of stimuli one by one, and the test taker listens to each stimulus and responds by choosing from the possible alternatives. The system has two user interfaces: one for test providers to administer exams and another for test takers to participate in exam sessions for training. Read the following properties carefully to understand the variations in exam format.

User classes

The ISPL system has three classes of users: provider, student, and administrator. An exam is owned by a provider (the creator of the exam), who is identified by a 'userid' of 8 printable characters that is unique in the system. Similarly, an exam has a 16-character name that is unique within each exam creator. The Web interface allows exam providers to upload sound stimuli after initial registration. The exam provider needs to type in her/his userid to log in and activate the uploading process. After logging in, the exam provider can also see all previously uploaded stimuli in the interface. Each uploaded stimulus is stored as a BLOB (Binary Large Object) along with a label describing the sound (e.g., a "burp" sound would be labeled "burp"). The Web interface also allows providers to view and update exam instances. An exam can be selected from the list after login and viewed as a sequence of trials, each showing items such as an instruction for choosing an answer, the answer selections with appropriate labels, and the expected answer.
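The stimulus storage described above can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for MySQL, and the table name, column names, and function names are hypothetical, not taken from the actual ISPL schema.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a hypothetical stimulus table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE stimulus (
               id    INTEGER PRIMARY KEY,
               owner TEXT NOT NULL,  -- userid of the exam provider
               label TEXT NOT NULL,  -- short description, e.g. "burp"
               sound BLOB NOT NULL   -- raw bytes of the uploaded sound file
           )"""
    )
    return conn


def upload_stimulus(conn, owner, label, sound_bytes):
    """Store one sound file as a BLOB together with its label."""
    cur = conn.execute(
        "INSERT INTO stimulus (owner, label, sound) VALUES (?, ?, ?)",
        (owner, label, sound_bytes),
    )
    conn.commit()
    return cur.lastrowid


def list_stimuli(conn, owner):
    """Return (id, label) for every stimulus the provider has uploaded."""
    rows = conn.execute(
        "SELECT id, label FROM stimulus WHERE owner = ? ORDER BY id",
        (owner,),
    )
    return rows.fetchall()
```

A provider's listing after login would then be a call such as `list_stimuli(conn, "provider")`, which returns only that provider's uploads.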

Students also need to register. A registered student can find all the exams in the system, download an interface to take exams, and view results and summary statistics of the exams taken. For security reasons, the administrator account is not open to the public.

Exam Framework

There are two types of exam: performance measurement and training. The former gives the test taker no immediate feedback on performance, while the latter does. In both cases, there are three kinds of exam, characterized by the number of stimuli per trial (one, two, or three). An exam consists of a sequence of blocks, and each block consists of a sequence of 10 trials. A trial is the actual Q/A presented in the Web browser. For instance, perf-2-3 denotes a performance exam with two stimuli per trial and three blocks, giving 30 trials in total.
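As a minimal sketch of how an exam code such as perf-2-3 could be decoded, consider the snippet below. The text only shows the "perf" prefix; the "train" prefix for training exams, the `ExamSpec` class, and the `parse_exam_code` function are assumptions made for illustration.

```python
from dataclasses import dataclass

TRIALS_PER_BLOCK = 10  # each block holds 10 trials, as stated in the text


@dataclass(frozen=True)
class ExamSpec:
    kind: str     # "perf", or "train" (hypothetical code for training exams)
    stimuli: int  # stimuli per trial: 1, 2, or 3
    blocks: int   # number of blocks in the exam

    @property
    def total_trials(self):
        return self.blocks * TRIALS_PER_BLOCK


def parse_exam_code(code):
    """Parse a code of the form '<kind>-<stimuli>-<blocks>'."""
    kind, stimuli, blocks = code.split("-")
    stimuli, blocks = int(stimuli), int(blocks)
    if kind not in ("perf", "train"):
        raise ValueError(f"unknown exam kind: {kind!r}")
    if stimuli not in (1, 2, 3):
        raise ValueError("stimuli per trial must be 1, 2, or 3")
    if blocks < 1:
        raise ValueError("an exam needs at least one block")
    return ExamSpec(kind, stimuli, blocks)
```

For example, `parse_exam_code("perf-2-3").total_trials` evaluates to 30, matching the count given in the text.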
  • Training exam: The system provides immediate feedback on each trial. If the answer is correct, the screen shows a correct status (for example, a green signal) and continues to the next trial. Otherwise, the screen shows an incorrect status and forces the test taker to repeat the trial until the right answer is selected. Performance measurement exams give no such feedback.
  • Order control: The order of the 10 trials in a block can be either randomized or fixed. The randomized option produces a 10-trial sequence from a random permutation of a specified set of trials (the specified set may contain more than 10 trials). If the order is not randomized, the first 10 trials in the specified order are selected.
  • Timing control: Four parameters adjust the timing of the exam: inter-stimulus interval (ISI), inter-trial interval (ITI), inter-block interval (IBI), and rest break point. The first three are specified in seconds, and the fourth is a number of blocks. For instance, 1-5-10-2 indicates that the pause between stimuli is 1 second, the pause between trials is 5 seconds, the pause between blocks is 10 seconds, and the exam pauses every two blocks until the test taker pushes a resume button.
  • Answer format: For a one-stimulus trial, there are exactly two response alternatives, which must be defined by the test provider. For a two-stimuli trial, the answer format is a choice of either "identical" or "different," or "first" or "second." For a three-stimuli trial, the answer format is either a choice of "first," "second," or "third," or an ASCII text box. The examples below explain these in more detail.
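The timing string from the list above (for example 1-5-10-2) could be handled roughly as in the following sketch. The names `Timing`, `parse_timing`, and `pause_after_block` are hypothetical. The sketch also assumes that at a rest break the wait for the resume button replaces the inter-block pause, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    isi: float       # inter-stimulus interval, seconds
    iti: float       # inter-trial interval, seconds
    ibi: float       # inter-block interval, seconds
    rest_every: int  # rest break after every N blocks


def parse_timing(text):
    """Parse an 'ISI-ITI-IBI-rest' string such as '1-5-10-2'."""
    isi, iti, ibi, rest = text.split("-")
    return Timing(float(isi), float(iti), float(ibi), int(rest))


def pause_after_block(timing, block_number, total_blocks):
    """Decide what follows the block with the given 1-based number.

    Returns ("end", 0.0) after the last block, ("rest", None) when the exam
    must wait for the resume button, and ("pause", seconds) otherwise.
    """
    if block_number >= total_blocks:
        return ("end", 0.0)
    if timing.rest_every > 0 and block_number % timing.rest_every == 0:
        return ("rest", None)  # assumption: the rest break replaces the IBI
    return ("pause", timing.ibi)
```

With "1-5-10-2" and a four-block exam, block 1 is followed by a 10-second pause and block 2 by a rest break that lasts until the test taker resumes.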

Example of one stimulus exam

Consider a perf-1-3 exam representing a typical [r/l] sound identification experiment with word pairs such as right/light, row/low, read/lead, ray/lay, and road/load. These words are recorded by a native English speaker, say in the alphabetical order {1 lay, 2 lead, 3 light, 4 load, 5 low, 6 ray, 7 read, 8 right, 9 road, 10 row}. Suppose that the randomize option is chosen. Then a random permutation (such as 7, 8, 3, 10, 1, 6, 9, 4, 2, 5) becomes the sequence of trials in one block. Two more blocks are generated in a similar manner. Notice that the word selections are made out of a larger collection of recorded instances. The record registration process is unrelated to this assignment and can be ignored for now. Notice also that, with the randomized option, no word is presented twice in the trial sequence of a block. On each trial, the test taker responds by clicking one of the on-screen buttons, labeled "L" and "R", which the exam provider specified as the answer format.
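The block generation in this example can be sketched as below. The ten word labels come from the r/l pairs listed in the text. The function names are hypothetical, and deriving the expected answer from the first letter of the label is an assumption for illustration only; in ISPL the exam provider supplies the expected answers.

```python
import random

# The ten recorded words, in alphabetical order.
WORDS = ["lay", "lead", "light", "load", "low",
         "ray", "read", "right", "road", "row"]


def make_block(trial_pool, randomized, rng, size=10):
    """Build one block of `size` trials from the provider's trial pool."""
    if len(trial_pool) < size:
        raise ValueError("trial pool is smaller than the block size")
    if randomized:
        # Sampling without replacement: no trial appears twice in a block.
        return rng.sample(trial_pool, size)
    # Not randomized: take the first `size` trials in the specified order.
    return list(trial_pool[:size])


def make_exam(trial_pool, blocks, randomized, rng):
    """Build all blocks of an exam, e.g. three blocks for perf-1-3."""
    return [make_block(trial_pool, randomized, rng) for _ in range(blocks)]


def expected_answer(word):
    """Assumed mapping from a word label to the 'L' or 'R' button."""
    return "L" if word.startswith("l") else "R"
```

Calling `make_exam(WORDS, 3, True, random.Random())` yields three blocks, each a different permutation of the ten words in most runs.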

Example of two stimuli exam

In a two-stimuli exam, the answer indicates whether the two stimuli are "identical" or "different." For instance, the answer is "identical" on hearing a right-right or light-light pair, and "different" on hearing a right-light or light-right pair. In another experiment, the two stimuli presented are always different, and the answer must indicate whether the "first" or the "second" stimulus contains the target sound. For instance, when asked which stimulus has an [r] sound, the answer should be "first" on hearing a "right-light" pair. The randomize option affects the order of the pairs (but not the order of the stimuli within a pair).
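The two answer formats for pairs can be illustrated with the short sketch below. All function names are hypothetical, and stimuli are compared by their labels, which is an assumption; the real system would rely on the expected answers stored by the exam provider.

```python
import random


def same_different_answer(pair):
    """Expected answer for the "identical" / "different" format."""
    first, second = pair
    return "identical" if first == second else "different"


def which_has_target_answer(pair, target_words):
    """Expected answer for the "first" / "second" format.

    `target_words` is the set of labels that contain the target sound,
    e.g. {"right"} when the question asks for the [r] sound.
    """
    first, second = pair
    if (first in target_words) == (second in target_words):
        raise ValueError("exactly one stimulus must contain the target sound")
    return "first" if first in target_words else "second"


def order_pairs(pairs, randomized, rng):
    """Order the pairs of a block; the order inside each pair is kept."""
    ordered = list(pairs)
    if randomized:
        rng.shuffle(ordered)  # moves whole pairs, never swaps their members
    return ordered
```

For the pair ("right", "light"), the first function gives "different" and the second, with `{"right"}` as the target set, gives "first".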

Example of three stimuli exam

The test taker indicates which one of the three stimuli should be excluded, that is, which one differs from the other two. Of the three response buttons, "first," "second," and "third," the correct response is "first" when the stimuli presented are right-light-light. Responses are made by clicking one of the labeled buttons. In another experiment, the answer requires an ASCII text response. For example, the test takers need to spell out each word they hear. The randomize option affects the order of the triples (but not the order of the stimuli within a triple).
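A minimal sketch of both three-stimuli answer formats follows. The function names are hypothetical. Comparing stimuli by label and ignoring case and surrounding spaces in typed answers are assumptions, since the text does not say how responses are matched.

```python
from collections import Counter

POSITIONS = ("first", "second", "third")


def odd_one_out(triple):
    """Return the button label for the stimulus that differs from the rest."""
    counts = Counter(triple)
    if len(triple) != 3 or sorted(counts.values()) != [1, 2]:
        raise ValueError("a triple needs two matching stimuli and one odd one")
    odd = next(item for item, n in counts.items() if n == 1)
    return POSITIONS[triple.index(odd)]


def check_spelling(expected_word, typed):
    """Score an ASCII text response against the expected word."""
    return typed.strip().lower() == expected_word.lower()
```

For the triple ("right", "light", "light") the expected button is "first", as in the example above.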