Pooling Data in the Chemistry Lab

This article considers the concept of pooling experimental data in the teaching lab. It’s a very simple but effective technique that I have used, both in low-tech and high(er)-tech versions. Some observations from my own experiences as well as some examples from the literature are presented.

What is pooling data?

Pooling data, in the sense I mean here, is the aggregation of all students' lab data on a particular experiment in one lab session. If ten pairs of students are completing an experiment, the pooled data would consist of a table of every pair's data, together with the class averages. In the low-tech version, the data are tabulated and averaged on the whiteboard, and students take both their own results and the class averages for analysis in their reports. In the high-tech version, the same analysis is done, but the results are entered into a spreadsheet which is uploaded to the virtual learning environment (VLE). This also allows a living graph to be drawn during the lab session as students enter their data. The approach of course requires that all students, or a significant proportion of them, complete the same experiment at the same time, something that might be difficult in upper-level undergraduate chemistry but should be quite feasible in years 1 and 2.
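As a rough illustration of the high-tech version, the pooled table and its class averages could be computed along the following lines. This is only a sketch under stated assumptions: the pair labels, concentrations and absorbance readings are invented, and `class_average` is a hypothetical helper name rather than part of any particular spreadsheet or VLE.

```python
from statistics import mean

# Invented example data: one row per student pair, one column per
# standard solution. Real values would come from the shared spreadsheet.
concentrations = [0.1, 0.2, 0.3, 0.4]  # assumed units: mol/L
pooled = {
    "Pair 1": [0.12, 0.25, 0.36, 0.49],
    "Pair 2": [0.11, 0.23, 0.35, 0.47],
    "Pair 3": [0.13, 0.26, 0.38, 0.51],
}


def class_average(table):
    """Average each column (one concentration) across all pairs."""
    columns = zip(*table.values())
    return [round(mean(col), 3) for col in columns]


averages = class_average(pooled)

# Print the pooled table with the class average as the final row,
# mirroring what would be written on the whiteboard.
print(f"{'Pair':<10}" + "".join(f"{c:>8}" for c in concentrations))
for name, row in pooled.items():
    print(f"{name:<10}" + "".join(f"{v:>8.3f}" for v in row))
print(f"{'Average':<10}" + "".join(f"{v:>8.3f}" for v in averages))
```

Each pair can then carry both its own row and the average row into the report.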

Pooling data across the lab means that students will quickly observe any differences in their experimental data

Why pool data?

The advantages of pooling lab data, from my own experience of doing it, are many. The first is that it opens up a dialogue across the group. Very often in a lab group, pairs work independently without considering that others in the lab are doing the same experiment and are hence a useful resource for comparing results. Entering the data in the public space means that students consider their results in light of others – are they around the same, how do they differ, should the measurement be repeated? This addresses issues about confidence in experimental approach. It also allows for a discussion on the nature of errors in the lab. Differences between human errors and instrumental variance, for example, can be easily identified. The latter will result in different individual values but the same trend (e.g. the same slope of a graph); the former will show up as differences in the trends themselves. The process also flags to students the extent of confidence they can have in their value, and illustrates the point of variance very clearly. Pairs can discuss with each other across the group how and why their values differ. It is a nice experience to watch a group of students standing at a board considering differences in values, rather than rushing to see if their plot gives them the “right answer”.
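The distinction between shifted values with the same trend and a genuinely different trend can be made concrete by fitting a straight line to each pair's data and comparing slopes. The sketch below assumes invented data, an arbitrary 10% tolerance, and hypothetical names (`fit_line`, `flagged`); it is one possible way to prompt the discussion, not a rule for deciding whose data are wrong.

```python
from statistics import mean, median


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Invented data: every pair measures y at the same x values.
x = [1.0, 2.0, 3.0, 4.0]
pairs = {
    "Pair 1": [2.0, 4.0, 6.0, 8.0],  # reference trend
    "Pair 2": [2.5, 4.5, 6.5, 8.5],  # constant offset, same slope
    "Pair 3": [2.0, 3.0, 4.0, 5.0],  # different slope
}

slopes = {name: fit_line(x, y)[0] for name, y in pairs.items()}
typical = median(slopes.values())

# Assumed tolerance: flag slopes more than 10% away from the class median.
tolerance = 0.10
flagged = [
    name
    for name, slope in slopes.items()
    if abs(slope - typical) > tolerance * abs(typical)
]

for name, slope in slopes.items():
    note = "differs in trend" if name in flagged else "same trend"
    print(f"{name}: slope = {slope:.2f} ({note})")
```

Here Pair 2 is offset from Pair 1 but shares its slope, while Pair 3 is the one flagged for a conversation about procedure.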

If a living graph approach is used, it provides a platform towards the end of the practical activity both for discussing experimental error, as described above (which can be illustrated visually on the scatter plot containing all data), and for initiating explanations of how data treatment should begin. In my own laboratory, students are required to stay the full time, meaning that the introduction (completed prior to the lab), together with the procedure, data recording and the beginnings of data analysis (all completed during the lab), are finished before they leave. This means that the analysis can take place in the presence of an instructor, which lessens the sense many students have that lab report writing is a black hole. The range of data available also allows meaningful statistical analysis to be performed by students.
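As an example of the kind of statistical analysis a pooled data set makes possible, students could summarise the class's values with a mean, a sample standard deviation and a 95% confidence interval. The numbers below are invented, `summarise` is a hypothetical helper, and the t value of 2.262 is the standard two-tailed value for 9 degrees of freedom, so it applies only when there are exactly ten results.

```python
from math import sqrt
from statistics import mean, stdev

# Invented results from ten pairs measuring the same quantity
# (arbitrary units).
values = [10.0, 12.0, 11.0, 9.0, 13.0, 10.0, 12.0, 11.0, 9.0, 13.0]

# Two-tailed 95% Student's t value for n - 1 = 9 degrees of freedom.
# This constant must be changed if the number of pairs changes.
T_95_DF9 = 2.262


def summarise(data, t_value):
    """Return mean, sample standard deviation and confidence interval."""
    n = len(data)
    centre = mean(data)
    spread = stdev(data)  # sample standard deviation (n - 1)
    half_width = t_value * spread / sqrt(n)
    return {
        "mean": centre,
        "stdev": spread,
        "ci": (centre - half_width, centre + half_width),
    }


summary = summarise(values, T_95_DF9)
low, high = summary["ci"]
print(f"mean = {summary['mean']:.2f}, s = {summary['stdev']:.2f}")
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```

A single pair could not produce an interval like this from one measurement, which is part of the argument for pooling.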

One final advantage of pooling data, in a slight variation of the method described so far, is that it may allow a larger-scale experiment to be done – for example the construction of a phase diagram – where each pair does some experiments that contribute to the total data set. This works well in the discovery approach to laboratory teaching; see the references for more details.

Examples from my practice

There is no limitation on the nature of experiments that can be used in this way, provided a consistent value is to be measured by the lab group as a whole. It is most useful, though, for obtaining data that is to be treated graphically. I have used it in a range of quantitative experiments based on the Beer-Lambert law, thermochemistry experiments, and kinetics measurements. Some of these involve data acquisition that doesn't allow for direct comparison, which in itself allows for discussion. For example, in a simple calorimeter calibration experiment, students will obtain different readings depending on the hot and cold water temperatures they use to calibrate. This opens up the floor for a discussion on heat capacity. Kinetic measurements (e.g. clock experiments) allow for a discussion on how confident one can be in a time measurement – and again on being clear, when reporting the procedure, about when a time measurement is made (a perennial question is “how does my procedure differ from the lab procedure?” – this kind of thing provides a great illustration of how!). I should say that students are initially reluctant to commit their data to the public scrutiny of their colleagues, but once the initial entering begins, all inhibitions are lost.

Examples from literature

Either my searching isn’t great, or it’s just too “simple” to report in the literature, but I haven’t seen many examples. Ricci and Ditzler (1991) discuss pooling lab data in a discovery approach to chemistry learning. While this has a slightly different emphasis to the approach described here, the essentials are the same – students compile a range of data on either the same measurements or different measurements, and use the advantage of a large number of samples to deduce conclusions. The authors provide some examples on a measurements experiment (mass of a penny based on year minted), stoichiometric experiments (mass of silver halide precipitate based on halide), electron configuration (colour of transition metal complexes), and molecular structure (using IR frequencies). Olsen (2008) uses pooled data to illustrate statistical concepts (central tendency and distribution) in the laboratory, using the combustion of magnesium and determination of the molecular formula of the oxide as the basis for the data acquisition. McGarvey (2009) describes the use of data pooling in his teaching laboratory:

“Live data pooling, supported by projection of tabulated/graphical data on a large screen in the laboratory, provides the added dimension of enabling students to actually see the data develop before their eyes as the practical class proceeds; they can also see how their classmates are progressing and how their own results compare with those of fellow students. A large data set also lends itself to more meaningful data (and error) analysis and it is also easier to assess adherence to (and deviations from) theoretical predictions with a large data set, which is more instructive for students when carrying out data processing and analysis.”


McGarvey, DJ (2009) Enhancing undergraduate chemistry practicals using live data pooling, Eurovariety in Chemistry Education, Manchester.

Olsen, RJ (2008) Using Pooled Data and Data Visualization To Introduce Statistical Concepts in the General Chemistry Laboratory, J. Chem. Educ., 85(4), 544.

Ricci, RW and Ditzler, MA (1991) A Laboratory-Centered Approach to Teaching General Chemistry, J. Chem. Educ., 68(3), 228.
