This article is one of a series of Experiments meant to teach students about how science is done, from generating a hypothesis to designing an experiment to analyzing the results with statistics. You can repeat the steps here and compare your results — or use this as inspiration to design your own experiment.
Research rarely turns out quite the way it’s planned. Mine certainly didn’t. I managed to create a gluten-free cookie that is the same size as a wheat cookie. I also found out that adding 1 teaspoon of xanthan gum, which can add springiness, made a cookie even chewier than my original. I now have the winning recipe for cookies that my friend Natalie can safely eat. But volunteers who tasted my last batches of cookies did not rate the gluten-free cookies as negatively as the taste-testers in the first experiment did. Now I have to think about why I didn’t get the same results.
This exercise is not designed to make me feel bad. Instead, finding out why my results might change can give me ideas about how to make future experiments better — and more likely to yield the same results each time.
In my first experiment, people rated my original-recipe cookie better than my two gluten-free alternatives. But in my second experiment, the taste-testers liked all the cookies equally. It didn’t seem to matter whether they were gluten-free or not.
In reviewing my experiments, I can think of two likely explanations for these seemingly contradictory assessments. First, I think cookie color might have played a role in what my tasters thought of my cookies. Second, the people who tasted my cookies were different in each experiment. Maybe they just had very different preferences.
One cookie, two cookie, red cookie, blue cookie
When I designed my experiment, I needed to find a way to tell apart cookies made with wheat flour and those that were free of wheat — and gluten. But I didn’t want to label each type of cookie and possibly bias my tasters. Instead, I wanted to make sure that my tasters had no clue which cookie had regular baking flour and which had some gluten-free alternative.
So I tinted them with food coloring. I made the wheat-flour cookies red. Those made with a rice (gluten-free) flour were green. Cookies made using a mix of non-wheat (gluten-free) flours were blue. In taste tests, people definitely preferred the red cookies over the blue and green ones.
But what if people found some of the non-red colors off-putting? Scientists have shown that a food’s color can affect how we judge its taste. When a food is a color that we expect (for example, light brown for chocolate-chip cookies) we think it has a more pleasant flavor than if it’s some odd color (such as green or purple).
To avoid this problem in my second experiment, I changed the cookies’ colors. This time, my original-recipe cookies were blue. My plain gluten-free cookies were yellow. And gluten-free cookies with the added xanthan gum were offered in green, red and purple tints.
After these adjustments, my tasters liked the cookies with xanthan gum best. But when it came to my wheat cookies and my gluten-free cookies without xanthan gum, people rated the blue and yellow cookies as equally good.
Was this because tasters preferred the wheat cookies when they were colored red, but not when I had tinted them blue? Or was it because they always preferred red or yellow cookies over those that were green or blue? Based on my data, I have no way of knowing.
If I were to do this experiment again, I now realize I should find another way to tell my cookies apart. I could try making all of my cookies in all of the colors and then see how this affected preferences for colors — even among the original-recipe cookies.
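Here is a minimal Python sketch of that "every recipe in every color" plan, sometimes called a full factorial design. The recipe and color names are placeholders I picked for illustration, and the helper functions `build_design` and `serving_order` are hypothetical names, not part of any library. The sketch only lists the batches to bake and shuffles the order in which they would be served.

```python
import itertools
import random

# Placeholder labels for illustration only (assumed, not my actual batch list).
RECIPES = ["wheat", "gluten-free", "gluten-free + xanthan gum"]
COLORS = ["red", "blue", "green", "yellow"]


def build_design(recipes, colors):
    """Pair every recipe with every color (a full factorial design)."""
    return list(itertools.product(recipes, colors))


def serving_order(design, seed=None):
    """Return the batches in a shuffled order so position does not hint at recipe."""
    rng = random.Random(seed)
    order = list(design)  # copy so the original design list is left untouched
    rng.shuffle(order)
    return order


design = build_design(RECIPES, COLORS)
print(len(design), "batches to bake")
for recipe, color in serving_order(design, seed=1):
    print(f"{color} cookie, {recipe} recipe")
```

Because each recipe shows up in each color, a difference in ratings between colors can be separated from a difference between recipes. The cost is more baking: three recipes in four colors means twelve batches.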
A reader on Twitter had another idea: Put cocoa in all of the cookies. Since the gluten-free flour and wheat flour may brown up differently, the cocoa should make them all equally brown. Then, she suggested adding M&M candies to the top. They come in different colors. So I could use different colored candies to code which recipe a cookie came from. Because most people know that all M&Ms taste like chocolate — no matter what their color — this coding should not affect how people rank a cookie’s taste.
Cookie tasters are people, too
In my first experiment, I set up a table in the town park near my home. I put out a local advertisement asking families to come by. I also asked people walking by to eat and rate my cookies. I brought the rest of the cookies to my office to be rated by co-workers. This means that for my first experiment, I had both adults and children judging my cookies. They found the gluten-free cookies least tasty.
But in the second experiment, no one showed up to my table in the town park. There was a football game that I didn’t know about, and everyone was out at the game. Back at the office, my co-workers had become pretty tired of my baking exploits. They no longer jumped at eating more cookies for science.
Luckily, I am in a community choir, the Capitol Hill Chorale. At choir practice, I was overrun with hungry volunteers. But this time there were no children among the taste testers. Even the adults, this time, tended to be older than in my first experiment. These adults rated the gluten-free cookies — with or without xanthan gum — as tasty as the wheat cookies.
So next time, I might want to make sure that my groups of tasters have similar demographics, meaning people of similar ages and a similar ratio of males to females.
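One way to check this is to summarize each group of tasters before comparing their ratings. The Python sketch below is only an illustration. The ages and sexes are invented placeholder values, not data from my real tasters, and `summarize` is a hypothetical helper. It also assumes that anyone under 18 counts as a child.

```python
from statistics import mean


def summarize(tasters):
    """Summarize a list of (age, sex) tuples for one group of tasters."""
    ages = [age for age, _ in tasters]
    return {
        "n": len(tasters),
        "mean_age": mean(ages),
        # Assumption: "child" means younger than 18.
        "share_children": sum(age < 18 for age in ages) / len(tasters),
        "share_female": sum(sex == "F" for _, sex in tasters) / len(tasters),
    }


# Invented placeholder data, NOT my real tasters.
park_and_office = [(8, "F"), (11, "M"), (34, "F"), (29, "M"), (41, "F"), (37, "M")]
choir = [(52, "F"), (61, "M"), (47, "F"), (68, "F"), (55, "M"), (59, "F")]

for name, group in [("park and office", park_and_office), ("choir", choir)]:
    print(name, summarize(group))
```

If the summaries look very different, as they do for these made-up groups, then a change in ratings between experiments might come from the tasters rather than from the cookies.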
In fact, many scientific studies could be improved based on lessons learned during the testing. And that’s why reviewing such lessons is an important part of the scientific process. So, as I do my experiments again, I now know what I’ll need to change to help me really understand how the cookies crumble.
Power Words
bias The tendency to hold a particular perspective. Scientists often blind subjects to the details of a test so that their biases will not affect the results.
control A part of an experiment where there is no change from normal conditions. The control is essential to scientific experiments. It shows that any new effect is likely due only to the part of the test that a researcher has altered. For example, if scientists were testing different types of fertilizer in a garden, they would want one section of it to remain unfertilized, as the control. Its area would show how plants in this garden grow under normal conditions. And that gives scientists something against which they can compare their experimental data.
demographics The characteristics typical of people in a population or tested group. This can include their average height, weight, age, income or sex.
gluten A pair of proteins — gliadin and glutenin — joined together and found in wheat, rye, spelt and barley. The bound proteins give bread, cake and cookie doughs their elasticity and chewiness. Some people may not be able to comfortably tolerate gluten, however, because of a gluten allergy or celiac disease.
polymer Substances whose molecules are made of long chains of repeating groups of atoms. Manufactured polymers include nylon, polyvinyl chloride (better known as PVC) and many types of plastics. Natural polymers include rubber, silk and cellulose (found in plants and used to make paper, for example).
questionnaire A list of identical questions administered to a group of people to collect related information on each of them. The questions may be delivered by voice, online or in writing. Questionnaires may elicit opinions, health information (like sleep times, weight or items in the last day’s meals), descriptions of daily habits (how much exercise you get or how much TV you watch) and demographic data (such as age, ethnic background, income and political affiliation).
statistics The practice or science of collecting and analyzing numerical data in large quantities and interpreting their meaning. Much of this work involves reducing errors that might be attributable to random variation. A professional who works in this field is called a statistician.
variable (in experiments) A factor that can be changed, especially one allowed to change in a scientific experiment. For instance, when measuring how much insecticide it might take to kill a fly, researchers might change the dose or the age at which the insect is exposed. Both the dose and age would be variables in this experiment.
xanthan gum A hydrocolloid made by the bacterium Xanthomonas campestris. It is a long-chained polymer often used in baking to make substances more elastic.