Tuesday, 28 July 2015

Sampling method

Choosing a sampling method

There are many methods of sampling when doing research. This guide can help you choose which method to use. Simple random sampling is the ideal, but researchers seldom have the luxury of time or money to access the whole population, so many compromises often have to be made.

Probability methods

This is the best overall group of methods to use as you can subsequently use the most powerful statistical analyses on the results.

Method	Best when
Simple random sampling	Whole population is available.
Stratified sampling (random within target groups)	There are specific sub-groups to investigate (e.g. demographic groupings).
Systematic sampling (every nth person)	A stream of representative people is available (e.g. in the street).
Cluster sampling (all in limited groups)	Population groups are separated and access to all is difficult, e.g. in many distant cities.
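
The four probability methods in the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame `people`, the two strata, and the ten "cities" are hypothetical, and real studies would draw them from actual population lists.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member of the population has an equal chance of selection."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample inside each sub-group (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def systematic_sample(stream, step, rng):
    """Take every `step`-th member, starting from a random offset."""
    start = rng.randrange(step)
    return stream[start::step]

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole groups at random and keep everyone in the chosen groups."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

rng = random.Random(2015)  # fixed seed so the sketch is repeatable
people = [f"person_{i:03d}" for i in range(100)]  # hypothetical sampling frame
strata = {"under_30": people[:40], "30_and_over": people[40:]}  # hypothetical
cities = {f"city_{c}": people[c * 10:(c + 1) * 10] for c in range(10)}

srs = simple_random_sample(people, 10, rng)
strat = stratified_sample(strata, 5, rng)
syst = systematic_sample(people, 10, rng)
clus = cluster_sample(cities, 2, rng)
```

Note that the cluster sample contains everyone from two cities, while the stratified sample contains a few people from every stratum.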

Quota methods

For a particular analysis and valid results, you can determine the number of people you need to sample.

In particular, when you are studying a number of groups and the sub-groups are small, you will need equivalent numbers in each group to enable equivalent analysis and conclusions.

Method	Best when
Quota sampling (get only as many as you need)	You have access to a wide population, including sub-groups
Proportionate quota sampling (in proportion to population sub-groups)	You know the population distribution across groups, and when normal sampling may not give enough in minority groups
Non-proportionate quota sampling (minimum number from each sub-group)	There is likely to be a wide variation in the studied characteristic within minority groups
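
As a rough sketch of how the two quota variants differ, the quotas can be computed from the sub-group sizes. The group names and sizes below are hypothetical, and the largest-remainder rounding is just one reasonable way to make the quotas add up to the target sample size.

```python
def proportionate_quotas(group_sizes, sample_size):
    """Split sample_size across groups in proportion to their population size.

    Uses largest-remainder rounding so the quotas sum exactly to sample_size.
    """
    total = sum(group_sizes.values())
    exact = {g: sample_size * size / total for g, size in group_sizes.items()}
    quotas = {g: int(share) for g, share in exact.items()}
    leftover = sample_size - sum(quotas.values())
    # Hand the remaining places to the groups with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda g: exact[g] - quotas[g], reverse=True)
    for g in by_remainder[:leftover]:
        quotas[g] += 1
    return quotas

def non_proportionate_quotas(group_sizes, minimum):
    """Ask for the same minimum number from every group, capped by group size."""
    return {g: min(minimum, size) for g, size in group_sizes.items()}

population = {"group_a": 700, "group_b": 250, "group_c": 50}  # hypothetical sizes
prop = proportionate_quotas(population, 100)
non_prop = non_proportionate_quotas(population, 30)
```

With these numbers the proportionate quota for the smallest group is only 5 people, while the non-proportionate quota asks for 30 from every group, which is why the latter suits small but highly variable sub-groups.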

Selective methods

Sometimes your study leads you to target particular groups.

Method	Best when
Purposive sampling (based on intent)	You are studying particular groups
Expert sampling (seeking 'experts')	You want expert opinion
Snowball sampling (ask for recommendations)	You seek similar subjects (e.g. young drinkers)
Modal instance sampling (focus on 'typical' people)	The 'typical' opinion you seek may get lost in a wider study, and you are able to identify the 'typical' group
Diversity sampling (deliberately seeking variation)	You are specifically seeking differences, e.g. to identify sub-groups or potential conflicts

Convenience methods

Good sampling is time-consuming and expensive. Not all experimenters have the time or funds to use more accurate methods. There is a price, of course, in the potentially limited validity of results.

Method	Best when
Snowball sampling (ask for recommendations)	You are ethically and socially able to ask and seek similar subjects.
Convenience sampling (use who's available)	You cannot proactively seek out subjects.
Judgment sampling (guess a good-enough sample)	You are expert and there is no other choice.

Ethnographic methods

When doing field-based observations, it is often impossible to intrude into the lives of people you are studying. Samples must thus be surreptitious and may be based more on who is available and willing to participate in any interviews or studies.

Method	Best when
Selective sampling (gut feel)	Focus is needed on a particular group, location, subject, etc.
Theoretical sampling (testing a theory)	Theories are emerging and focused sampling may help clarify these.
Convenience sampling (use who's available)	You cannot proactively seek out subjects.
Judgment sampling (guess a good-enough sample)	You are expert and there is no other choice.

Reference

http://changingminds.org/explanations/research/sampling/choosing_sampling.htm

Random Sampling

Simple random sampling is one of the most popular types of random or probability sampling.

In this technique, each member of the population has an equal chance of being selected as subject. The entire process of sampling is done in a single step with each subject selected independently of the other members of the population.

There are many methods to proceed with simple random sampling. The most primitive and mechanical would be the lottery method. Each member of the population is assigned a unique number. Each number is placed in a bowl or a hat and mixed thoroughly. The blind-folded researcher then picks numbered tags from the hat. All the individuals bearing the numbers picked by the researcher are the subjects for the study. Another way would be to let a computer do a random selection from your population. For populations with a small number of members, it is advisable to use the first method but if the population has many members, a computer-aided random selection is preferred.
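
The lottery method described above can be mimicked on a computer. The sketch below is an illustration only: the population of 10 numbered tags, the sample of 3, and the 5000 repeated draws are arbitrary choices made to check that each member is picked about equally often.

```python
import random
from collections import Counter

def lottery_draw(population_size, sample_size, rng):
    """Number every member 1..population_size and draw tags without replacement."""
    tags = list(range(1, population_size + 1))  # one numbered tag per member
    rng.shuffle(tags)                           # mix the "hat" thoroughly
    return tags[:sample_size]                   # the first tags picked

rng = random.Random(1)  # fixed seed so the sketch is repeatable
draws = 5000
counts = Counter()
for _ in range(draws):
    counts.update(lottery_draw(10, 3, rng))

# Share of draws in which each tag was selected; each should be close to 3/10.
frequencies = {tag: counts[tag] / draws for tag in range(1, 11)}
```

Every tag ends up selected in roughly 30% of the draws, which is what "equal chance of being selected" means in practice.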

Advantages of Simple Random Sampling

One of the best things about simple random sampling is the ease of assembling the sample. It is also considered a fair way of selecting a sample from a given population, since every member is given an equal opportunity of being selected.

Another key feature of simple random sampling is its representativeness of the population. Theoretically, the only thing that can compromise its representativeness is luck. If the sample is not representative of the population, the random variation is called sampling error.

An unbiased random selection and a representative sample are important in drawing conclusions from the results of a study. Remember that one of the goals of research is to be able to make conclusions pertaining to the population from the results obtained from a sample. Due to the representativeness of a sample obtained by simple random sampling, it is reasonable to make generalizations from the results of the sample back to the population.

Disadvantages of Simple Random Sampling

One of the most obvious limitations of the simple random sampling method is its need for a complete list of all the members of the population. Please keep in mind that the list of the population must be complete and up-to-date. This list is usually not available for large populations. In such cases, it is wiser to use other sampling techniques.

Reference

https://explorable.com/simple-random-sampling

What is a parameter estimate (also called a sample statistic)?

Parameters are descriptive measures of an entire population. However, their values are usually unknown because it is infeasible to measure an entire population. Because of this, you can take a random sample from the population to obtain parameter estimates. One goal of statistical analyses is to obtain estimates of the population parameters along with the amount of error associated with these estimates. These estimates are also known as sample statistics. A fitted distribution line is a curve based on the parameter estimates instead of on the true parameter values.

There are several types of parameter estimates:

Point estimates are the single, most likely value of a parameter. For example, the point estimate of population mean (the parameter) is the sample mean (the parameter estimate).
Confidence intervals are a range of values likely to contain the population parameter.

For an example of parameter estimates, suppose you work for a spark plug manufacturer that is studying a problem in their spark plug gap. It would be too costly to measure every single spark plug that is made. Instead, you randomly sample 100 spark plugs and measure the gap in millimeters. The mean of the sample is 9.2. This is the point estimate for the population mean (μ), and it informs you that the most likely value for the average gap for all spark plugs is 9.2. You also create a 95% confidence interval for μ, which is (8.8, 9.6). This means that you can be 95% confident that the true value of the average gap for all the spark plugs is between 8.8 and 9.6.
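
A minimal sketch of how a point estimate and a 95% confidence interval might be computed from sample data. It assumes the large-sample normal approximation (multiplier 1.96); the source does not say which method produced the interval (8.8, 9.6), and statistical software may use a t-distribution instead. Both function names are made up for this illustration.

```python
import math
import statistics

def mean_confidence_interval(values, z=1.96):
    """Point estimate and normal-approximation confidence interval for the mean.

    z = 1.96 corresponds to roughly 95% confidence for large samples.
    """
    n = len(values)
    mean = statistics.fmean(values)                       # point estimate
    std_error = statistics.stdev(values) / math.sqrt(n)   # error of the estimate
    return mean, (mean - z * std_error, mean + z * std_error)

def implied_std_dev(half_width, n, z=1.96):
    """Sample standard deviation that would give an interval of this half-width."""
    return half_width * math.sqrt(n) / z

# Spark plug example: n = 100 and the interval is 9.2 +/- 0.4.
sd_needed = implied_std_dev(0.4, 100)

# Tiny illustrative data set (not from the source).
mean, (low, high) = mean_confidence_interval([8.0, 9.0, 10.0])
```

Under this approximation, the quoted interval would correspond to a sample standard deviation of about 2.04 mm.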

Reference

http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/introductory-concepts/basic-concepts/parameters/

Bias

If the sampling distribution is known then the ability of the sample statistic to estimate the corresponding population parameter can be determined.
In particular, the sampling distribution determines the expected value and variance of the sampling statistic. If the expected value of the statistic is equal to the population parameter, the estimator is unbiased. If the variance of the statistic is 'small' and it is also unbiased then an observed statistic is likely to be close to the population parameter.
Bias = distance between parameter and expected value of sample statistics
Sample statistics can accordingly be classified into four cases:
1. Low bias, high variability. The estimates have low bias because their average is near the population parameter, but high variability because they are widely spread, so a single sample value could be far from the parameter.
2. High bias, high variability. The estimates are biased because the expected value is not equal to the parameter, and they also have high variability because they are widely spread out.
3. High bias, low variability. The estimates are biased because all of them are systematically higher than the population parameter, but they have low variability because they are all close together.
4. Low bias, low variability. The estimates have both low bias and low variability.
Experimental design aims to simultaneously reduce bias and variability by producing a sampling distribution like the one in case 4.
In general
sample statistic = population parameter + bias + chance variation
Inferences about the characteristics of a population are based on data from a sample.

If the sample is not representative of the population being studied, the sample statistic may be biased, so you cannot use it to make valid inferences about the population parameter.
To minimise bias the sample should be chosen by random sampling from a list of all individuals in the relevant population. This list is called the sampling frame. It is essential.
For a simple random sample the individuals are chosen in such a way that each individual in the sampling frame has an equal chance of being selected. This may involve using computer generated random numbers to select the sample.
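
The relation sample statistic = population parameter + bias + chance variation can be explored with a small simulation. This is a sketch under assumptions: the population is 10,000 made-up values, the "shifted" estimator is deliberately biased by adding 5, and the helper names are hypothetical.

```python
import random
import statistics

def sampling_distribution(estimator, population, n, repeats, rng):
    """Apply the estimator to many simple random samples of size n."""
    return [estimator(rng.sample(population, n)) for _ in range(repeats)]

def bias_and_spread(estimates, parameter):
    """Bias = expected value of the statistic minus the parameter."""
    bias = statistics.fmean(estimates) - parameter
    spread = statistics.pstdev(estimates)  # variability of the statistic
    return bias, spread

rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.gauss(50, 10) for _ in range(10_000)]  # hypothetical frame
mu = statistics.fmean(population)                         # population parameter

unbiased = sampling_distribution(statistics.fmean, population, 25, 2000, rng)
shifted = sampling_distribution(
    lambda s: statistics.fmean(s) + 5, population, 25, 2000, rng
)

bias_u, spread_u = bias_and_spread(unbiased, mu)
bias_s, spread_s = bias_and_spread(shifted, mu)
```

The sample mean shows a bias near zero, while the shifted estimator shows a bias near 5 with about the same spread, matching the low-bias and high-bias cases described above.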

Reference

https://surfstat.anu.edu.au/surfstat-home/4-1-1.html

What is the Difference Between a Statistic and a Parameter?

A statistic and a parameter are very similar. They are both descriptions of groups, like “50% of dog owners prefer X Brand dog food.” The difference between a statistic and a parameter is that statistics describe a sample. A parameter describes an entire population.

For example, you randomly poll voters in an election. You find that 55% of the voters you polled plan to vote for candidate A. That is a statistic. Why? You only asked a sample of the population who they are voting for, and you use that sample to estimate what the population is likely to do.

You could ask a class of third graders who likes vanilla ice cream. 90% raise their hands. You have a parameter: 90% of that class likes vanilla ice cream. You know this because you asked everyone in the class.

If in doubt, think about the time and cost involved in surveying an entire population. If you can’t imagine anyone wanting to spend the time or the money to survey a large number (or impossible number) in a certain group, then you almost certainly are looking at a statistic.
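
The distinction can be shown with the ice cream example. The class of 30 pupils and the sample of 10 are assumptions made for this sketch; the source only gives the 90% figure.

```python
import random

def proportion(answers):
    """Share of True answers in a group."""
    return sum(answers) / len(answers)

# Hypothetical class of 30 third graders: 27 like vanilla ice cream.
whole_class = [True] * 27 + [False] * 3
parameter = proportion(whole_class)   # everyone was asked: describes the population

rng = random.Random(7)                # fixed seed so the sketch is repeatable
asked = rng.sample(whole_class, 10)   # only 10 pupils were asked
statistic = proportion(asked)         # describes the sample only
```

The parameter is exactly 0.9 because the whole class was asked. The statistic depends on which 10 pupils happened to be chosen, so it can differ from 0.9.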

Reference

http://www.statisticshowto.com/how-to-tell-the-difference-between-a-statistic-and-a-parameter/

Accuracy and Precision

Accuracy

Accuracy is how close a measured value is to the actual (true) value.

Precision

Precision is how close the measured values are to each other.

Bias (don't let precision fool you!)

When we measure something several times and all values are close, they may all be wrong if there is a "Bias".

Bias is a systematic (built-in) error which makes all measurements wrong by a certain amount.

Examples of Bias

The scales read "1 kg" when there is nothing on them.
You always measure your height wearing shoes with thick soles.
A stopwatch takes half a second to stop when clicked.

In each case all measurements are wrong by the same amount. That is bias.
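
A short sketch of how bias and precision can be told apart numerically. The readings are invented: they imagine a known 5 kg weight placed on scales that read 1 kg when empty, and the function name is hypothetical.

```python
import statistics

def accuracy_report(measurements, true_value):
    """Separate the systematic error (bias) from the spread (precision)."""
    mean = statistics.fmean(measurements)
    return {
        "bias": mean - true_value,                        # systematic error
        "precision_sd": statistics.stdev(measurements),   # spread of readings
    }

# Hypothetical readings of a known 5 kg weight on the offset scales.
readings = [6.01, 5.99, 6.00, 6.02, 5.98]
report = accuracy_report(readings, true_value=5.0)

# Once the bias is known from a reference weight, it can be subtracted.
corrected = [r - report["bias"] for r in readings]
```

The readings are precise (they differ by only a few hundredths of a kilogram) but not accurate, because every one of them is about 1 kg too high.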

Reference

http://www.antarcticglaciers.org/glacial-geology/dating-glacial-sediments-2/precision-and-accuracy-glacial-geology/

http://www.mathsisfun.com/accuracy-precision.html

Population and Samples

Populations

In statistics the term "population" has a slightly different meaning from the one given to it in ordinary speech. It need not refer only to people or to animate creatures - the population of Britain, for instance, or the dog population of London. Statisticians also speak of a population of objects, or events, or procedures, or observations, including such things as the quantity of lead in urine, visits to the doctor, or surgical operations. A population is thus an aggregate of creatures, things, cases and so on.

Samples

A population commonly contains too many individuals to study conveniently, so an investigation is often restricted to one or more samples drawn from it. A well chosen sample will contain most of the information about a particular population parameter but the relation between the sample and the population must be such as to allow true inferences to be made about a population from that sample.

Reference

http://www.bmj.com/about-bmj/resources-readers/publications/statistics-square-one/3-populations-and-samples

Sample and population difference

What is the difference between a population and a sample?

A sample is a subset of people, items, or events from a larger population that you collect and analyze to make inferences. To represent the population well, a sample should be randomly collected and adequately large.

To understand the basic foundation for hypothesis testing and other types of inferential statistics, it’s important to understand how a sample and a population differ.

A population is a collection of people, items, or events about which you want to make inferences. It is not always convenient or possible to examine every member of an entire population. For example, it is not practical to count the bruises on all apples picked at an orchard. It is possible, however, to count the bruises on a set of apples taken from that population. This subset of the population is called a sample.

If the sample is random and large enough, you can use the information collected from the sample to make inferences about the population. For example, you could count the number of apples with bruises in a random sample and then use a hypothesis test to estimate the percentage of all the apples that have bruises.

Reference

http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/introductory-concepts/basic-concepts/sample-and-population/

Monday, 27 July 2015

Variables and variation

A variable is a measurable characteristic that varies. It may change from group to group, person to person, or even within one person over time. There are six common variable types:

Independent variable

An independent variable is exactly what it sounds like. It is a variable that stands alone and isn't changed by the other variables you are trying to measure. For example, someone's age might be an independent variable. Other factors (such as what they eat, how much they go to school, how much television they watch) aren't going to change a person's age. In fact, when you are looking for some kind of relationship between variables you are trying to see if the independent variable causes some kind of change in the other variables, or dependent variables.

Dependent variable

Just like an independent variable, a dependent variable is exactly what it sounds like. It is something that depends on other factors. For example, a test score could be a dependent variable because it could change depending on several factors such as how much you studied, how much sleep you got the night before you took the test, or even how hungry you were when you took it. Usually when you are looking for a relationship between two things you are trying to find out what makes the dependent variable change the way it does.

A variable is a quantity whose value changes.

A discrete variable is a variable whose value is obtained by counting.

Examples:
number of students present
number of red marbles in a jar
number of heads when flipping three coins
students’ grade level

A continuous variable is a variable whose value is obtained by measuring.

Examples:
height of students in class
weight of students in class
time it takes to get to school
distance traveled between classes

Reference

https://nces.ed.gov/nceskids/help/user_guide/graph/variables.asp

http://www.henry.k12.ga.us/ugh/apstat/chapternotes/7supplement.html

Friday, 24 July 2015

Short Communication

A short communication is for a concise, but independent report representing a significant contribution to Biotechnology.

A short communication is not intended for publishing preliminary results. Preliminary results will be considered for publication only if they are of exceptional interest and are particularly topical and relevant.

It should be no more than 2500 words, and could include two figures or tables. It should have at least 8 references.

Short communications are also sent for peer review.

Reference

http://ejbiotechnology.ucv.cl/iaformato/short_communications.html

Thursday, 23 July 2015

Research Paper

What is a Research Paper?

"Research paper." What image comes into mind as you hear those words: working with stacks of articles and books, hunting the "treasure" of others' thoughts? Whatever image you create, it's a sure bet that you're envisioning sources of information--articles, books, people, artworks. Yet a research paper is more than the sum of your sources, more than a collection of different pieces of information about a topic, and more than a review of the literature in a field. A research paper analyzes a perspective or argues a point. Regardless of the type of research paper you are writing, your finished research paper should present your own thinking backed up by others' ideas and information.

To draw a parallel, a lawyer researches and reads about many cases and uses them to support their own case. A scientist reads many case studies to support an idea about a scientific principle. In the same way, a history student writing about the Vietnam War might read newspaper articles and books and interview veterans to develop and/or confirm a viewpoint and support it with evidence.

A research paper is an expanded essay that presents your own interpretation or evaluation or argument. When you write an essay, you use everything that you personally know and have thought about a subject. When you write a research paper you build upon what you know about the subject and make a deliberate attempt to find out what experts know. A research paper involves surveying a field of knowledge in order to find the best possible information in that field. And that survey can be orderly and focused, if you know how to approach it. Don't worry--you won't get lost in a sea of sources.

In fact, this guide is designed to help you navigate the research voyage, through developing a research question and thesis, doing the research, writing the paper, and correctly documenting your sources.

Reference

https://www.esc.edu/online-writing-center/resources/research/research-paper/

Scientific Control

A scientific control is an experiment or observation designed to minimize the effects of variables other than the single independent variable. This increases the reliability of the results, often through a comparison between control measurements and the other measurements. Scientific controls are a part of the scientific method.

Controlled experiments

There are many forms of controlled experiments. A relatively simple one separates research subjects or biological specimens into two groups: an experimental group and a control group. No treatment is given to the control group, while the experimental group is changed according to some key variable of interest, and the two groups are otherwise kept under the same conditions.

Negative

Negative controls are groups where no phenomenon is expected. They ensure that there is no effect when there should be no effect. To continue with the example of drug testing, a negative control is a group that has not been administered the drug of interest. This group receives either no preparation at all or a sham preparation (that is, a placebo), either an excipient-only (also called vehicle-only) preparation or the proverbial "sugar pill." We would say that the control group should show a negative or null effect.

In an example where there are only two possible outcomes, positive and negative, then if the treatment group and the negative control both produce a negative result, it can be inferred that the treatment had no effect. If the treatment group and the negative control both produce a positive result, it can be inferred that a confounding variable acted on the experiment, and the positive results are not due to the treatment.

In other examples, outcomes might be measured as lengths, times, percentages, and so forth. For the drug testing example, we could measure the percentage of patients cured. In this case, the treatment is inferred to have no effect when the treatment group and the negative control produce the same results. Some improvement is expected in the placebo group due to the placebo effect, and this result sets the baseline which the treatment must improve upon. Even if the treatment group shows improvement, it needs to be compared to the placebo group. If the groups show the same effect, then the treatment was not responsible for the improvement (because the same number of patients were cured in the absence of the treatment). The treatment is only effective if the treatment group shows more improvement than the placebo group.

Positive

Positive controls are groups where a phenomenon is expected. That is, they ensure that there is an effect when there should be an effect, by using an experimental treatment that is already known to produce that effect (and then comparing this to the treatment that is being investigated in the experiment).

Positive controls are often used to assess test validity. For example, to assess a new test's ability to detect a disease (its sensitivity), then we can compare it against a different test that is already known to work. The well-established test is the positive control, since we already know that the answer to the question (whether the test works) is yes.

Similarly, in an enzyme assay to measure the amount of an enzyme in a set of extracts, a positive control would be an assay containing a known quantity of the purified enzyme (while a negative control would contain no enzyme). The positive control should give a large amount of enzyme activity, while the negative control should give very low to no activity.

If the positive control does not produce the expected result, there may be something wrong with the experimental procedure, and the experiment is repeated. For difficult or complicated experiments, the result from the positive control can also help in comparison to previous experimental results. For example, if the well-established disease test was determined to have the same effectiveness as found by previous experimenters, this indicates that the experiment is being performed in the same way that the previous experimenters did.

When possible, multiple positive controls may be used — if there is more than one disease test that is known to be effective, more than one might be tested. Multiple positive controls also allow finer comparisons of the results (calibration, or standardization) if the expected results from the positive controls have different sizes. For example, in the enzyme assay discussed above, a standard curve may be produced by making many different samples with different quantities of the enzyme.
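
A standard curve of this kind can be sketched as a straight-line fit through the positive controls. The quantities and activity readings below are invented, and the sketch assumes the assay responds linearly over this range, which a real assay may not.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def quantity_from_signal(signal, slope, intercept):
    """Read an unknown sample's quantity off the standard curve."""
    return (signal - intercept) / slope

# Hypothetical controls: known enzyme quantities and measured activity.
# The 0.0 entry plays the role of the negative control (no enzyme).
known_quantity = [0.0, 1.0, 2.0, 4.0, 8.0]
measured_activity = [0.1, 2.1, 4.1, 8.1, 16.1]

slope, intercept = fit_line(known_quantity, measured_activity)
unknown = quantity_from_signal(10.1, slope, intercept)
```

Here an extract with an activity reading of 10.1 would be estimated to contain about 5 units of enzyme, in whatever units the controls were prepared in.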

Randomization

In randomization, the groups that receive different experimental treatments are determined randomly. While this does not ensure that there are no differences between the groups, it ensures that the differences are distributed equally, thus correcting for systematic errors.

For example, in experiments where crop yield is affected (e.g. soil fertility), the experiment can be controlled by assigning the treatments to randomly selected plots of land. This mitigates the effect of variations in soil composition on the yield.
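
Random assignment of treatments to plots can be sketched as a shuffle followed by dealing the plots out in turn. The twelve plot names and the two treatment labels are hypothetical.

```python
import random

def randomize_treatments(units, treatments, rng):
    """Randomly assign treatments to units in (near-)equal numbers."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out to the treatments in turn.
    return {unit: treatments[i % len(treatments)]
            for i, unit in enumerate(shuffled)}

rng = random.Random(3)  # fixed seed so the sketch is repeatable
plots = [f"plot_{i}" for i in range(12)]  # hypothetical field plots
assignment = randomize_treatments(plots, ["fertilizer", "control"], rng)
```

Because the order is shuffled before the treatments are dealt out, a fertile corner of the field is no more likely to end up in one group than in the other.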

Blind experiments

In blind experiments, at least some information is withheld from participants in the experiments (but not the experimenter). For example, to evaluate the success of a medical treatment, an outside expert might be asked to examine blood samples from each of the patients without knowing which patients received the treatment and which did not. If the expert's conclusions as to which samples represent the best outcome correlate with the patients who received the treatment, this allows the experimenter to have much higher confidence that the treatment is effective.

The blinding eliminates effects such as confirmation bias and wishful thinking that might occur if the samples were evaluated by someone who knew which samples were in which group.

Double-blind experiments

In double-blind experiments, at least some participants and some experimenters do not possess full information while the experiment is being carried out. Double-blind experiments are most often used in clinical trials of medical treatments, to verify that the supposed effects of the treatment are produced only by the treatment itself. Trials are typically randomized and double-blinded, with two (statistically) identical groups of patients being compared. The treatment group receives the treatment, and the control group receives a placebo such as a sugar pill. The placebo is the "first" blind, and controls for the patient expectations that come with taking a pill, which can have an effect on patient outcomes. The "second" blind, of the experimenters, controls for the effects on patient expectations due to unintentional differences in the experimenters' behavior. Since the experimenters do not know which patients are in which group, they cannot unconsciously influence the patients. After the experiment is over, they then "unblind" themselves and analyse the results.

In clinical trials involving a surgical procedure, a sham operated group is used to ensure that the data reflect the effects of the experiment itself, and are not a consequence of the surgery. In this case, double blinding is achieved by ensuring that the patient does not know whether their surgery was real or sham, and that the experimenters who evaluate patient outcomes are different from the surgeons and do not know which patients are in which group.

Reference

https://en.wikipedia.org/wiki/Scientific_control#Types_of_control

Wednesday, 22 July 2015

Footnotes

footnote

(ˈfʊtˌnəʊt)

1. (Printing, Lithography & Bookbinding) a note printed at the bottom of a page, to which attention is drawn by means of a reference mark in the body of the text

Reference

http://www.thefreedictionary.com/footnote

Acknowledgment

In the creative arts and scientific literature, an acknowledgment (also spelled acknowledgement) is an expression of gratitude for assistance in creating an original work.

Receiving credit by way of acknowledgment rather than authorship indicates that the person or organization did not have a direct hand in producing the work in question, but may have contributed funding, criticism, or encouragement to the author(s). Various schemes exist for classifying acknowledgments; Cronin et al. [1] give the following six categories:

moral support
financial support
editorial support
presentational support
instrumental/technical support
conceptual support, or peer interactive communication (PIC)

Apart from citation, which is not usually considered to be an acknowledgment, acknowledgment of conceptual support is widely considered to be the most important for identifying intellectual debt. Some acknowledgments of financial support, on the other hand, may simply be legal formalities imposed by the granting institution.

Reference

https://en.wikipedia.org/wiki/Acknowledgment_(creative_arts_and_sciences)

References and Acknowledgment

In your academic work you are expected to draw upon evidence from, and substantiate claims with, up-to-date, relevant and reputable sources. Whenever you use information that has originally appeared in someone else’s work, you must acknowledge clearly its original source. Give the source of every instance of borrowing, whether from a primary document, a literary text or a secondary work. Each new act of borrowing, even from a source already cited, requires acknowledgement. The tradition of scholarship depends upon scrupulous acknowledgement of sources.

We reference material and ideas sourced from the work of others for several reasons. First of all it is required by custom and, in some cases, even by law, to give credit where it is due. References and citations acknowledge previous work conducted by other scholars, and indicate who is responsible for the different elements that have been brought together to make the essay a whole. In addition to this, we give references in order to distinguish ourselves as authors of our original work. The reader needs to know which ideas are your original contributions; the best way to do this is to indicate what is not yours (it should be considerably less than what is yours). Finally, we give references so that readers who are interested in the subject can do further reading, consult our sources independently, and verify our interpretation of the evidence.

You must clearly acknowledge your references when you quote (use the original source’s exact words), paraphrase (express a source’s ideas in different words) or summarise (outline the main points) information, ideas, text, data, tables, figures or any other material which originally appeared in someone else’s work. References may be to sources such as books, journals, newspapers, maps, films, photographs, reports, electronic sites or personal communications (for example, letters or conversations).

You must always make your acknowledgements in a consistent and recognisable format. References must provide enough bibliographic information for your reader to be able to find your source easily.

Reference

http://ehlt.flinders.edu.au/humanities/exchange/style/req_reference.html

Why do journals ask for keywords?

To publish is to make known. By publishing research papers, journals make research known to their readers. However, most researchers read only a few journals regularly. Typically, these are the journals that focus on topics most relevant to a researcher: those working on rheumatology, for example, may read the Journal of Rheumatology, while those working on environmental economics may read the Journal of Environmental Economics. In addition to these specialist journals, most researchers also read (or at least look at the contents page of) one or two multidisciplinary journals such as Nature or Science. Researchers read these journals to keep themselves updated. However, papers that are relevant to a particular researcher may appear in journals that the researcher does not see regularly or may not see at all. This is where search engines and indexing services prove useful—and they need keywords to do their job.

A keyword is a key to information. Keywords point researchers to relevant papers—papers that may not come to a researcher’s attention in the normal course of her or his reading. Relevant papers may escape notice because they are published in journals that a particular researcher does not read regularly. And even when such papers are published in journals that the researcher does read regularly, he or she may not realize that those papers are relevant because their titles may fail to indicate their relevance. Let us take an example to see why keywords are useful. A paper titled ‘New approaches to the treatment of diabetes’ describes how some medicinal herbs can help in treating the disease. However, the title does not mention this, nor does it mention the names of those herbs. Suitable keywords for such a paper will include the scientific names of those herbs, and a search for any of those names will lead other researchers to that paper.

Therefore, do not use words or terms in the title as keywords: the function of keywords is to supplement the information given in the title. Words in the title are automatically included in indexes, and keywords serve as additional pointers. Lastly, how should you pick keywords? Here are some suggestions to consider while selecting journal keywords:

If the paper focuses on a particular region (geographic, climatic, etc.), use that as a keyword (semi-arid tropics, the polar region, coniferous forests).
Consider the experimental material and techniques, which may suggest suitable keywords (HPLC, alkaloids, x-ray crystallography, animal dung).
Check whether potential applications can serve as keywords (organic farming, treatment of cancer, long-term preservation, energy efficiency).
Use specific phenomena or issues as keywords (climate change, air pollution, sustainable development, genetic engineering).
Do not use words or phrases from the title as keywords.

Reference

http://www.editage.com/insights/why-do-journals-ask-for-keywords