CHAPTER ONE
1.0 INTRODUCTION
Very often in practice, a relationship is found to exist between two or more variables. In some research problems, two measurements are taken on each of the units under consideration. We may be interested in finding a relationship between “heights of students and their respective weights”, “income and expenditure”, “unemployment and crime rate”, and so on.
The statistical method used to find out whether any relationship exists between two sets of variables, and to establish an equation representing this relationship that can also be used for prediction, is known as “CORRELATION AND REGRESSION”. The measure of the closeness or degree of relationship between two or more variables is known as correlation analysis, while finding the equation of the line that represents the relationship between two or more variables, which can also be used for prediction, is referred to as “regression analysis”.
The major focus of correlation and regression is to study the changes in one variable, called the dependent variable ‘Y’, that are brought about by the values of the other variable, called the independent variable ‘X’. The independent variable can usually be manipulated or controlled, while the observed response is recorded as the dependent variable. A mathematical model relating the two variables can usually be formed. For this analysis, it is assumed that the measurement is at least on the interval scale and that the dependent and independent variables have a linear relationship.
In the course of this project, we shall consider the sales and advertising expenditure of Maltina drink, one of the products of International Breweries PLC, Ilesa, using the statistical methods of correlation and regression.
1.1 HISTORICAL BACKGROUND
International Breweries PLC, Ilesa, was incorporated in Nigeria as a private limited liability company on 22nd December 1971 but commenced operation in December 1978. The objective was to establish a brewery to produce and market high quality beer with the brand name “Trophy”. The company operates from its brewery, where it also brews and sells the “Major” brand of lager beer, a beer brewed with 100% local inputs. Besides lager beer, the company also produces and bottles “Maltina”, a non-alcoholic beverage, to increase the variety of its products. Over the years, these products have been accepted not only locally, with the NIS award, but also in the international markets. All these products are now made from 100% local cereals, and the company has developed an unflagging resolve to stay in the front league in the production of high quality products in the highly competitive beer and beverage markets, with the overall objective of satisfying the consuming public in Nigeria.
International Breweries PLC, Ilesa, has well organised sales and advertising units. These units are under the marketing department of the company. The sales and advertising managers, who are also members of the Sales and Advertising Practitioner Council of Nigeria (SAPCON), report to the head of the marketing department. The creative section of the sales and advertising units is under the leadership of a professional graphic artist. Each of the company’s products is handled by a separate advertising agency.
The company uses outdoor media (billboards and street signs), print media (newspapers) and electronic media (radio and television) as means of advertising.
MALTINA
Maltina is one of the products produced by International Breweries Plc and was introduced after the production of Trophy to boost the company’s income. It is a non-alcoholic drink, which has contributed to the nationwide recognition of the product and the company. Each bottle holds 28 cl of liquid made from the following ingredients: malted sorghum, maize and sucrose.
40% of the total annual sales of the company is generated from Maltina, and the company spends heavily on advertising this product.
The company also encounters some problems, such as:
1. GOVERNMENT POLICY: This occurs through government intervention by the introduction of taxes and tariffs, which affect the income of the company.
2. COMPETITORS’ STRATEGIES: This occurs when there are close substitutes. Following the enactment of decree 62 of 1979 constitution of the Federal Republic of Nigeria, which led to the free establishment of producers of both alcoholic and non-alcoholic drinks, the company has been facing a series of competition from close substitutes in the market, which reduces the total profit and overall income of the company. An example of a close substitute is “malture”, manufactured by Nigeria Breweries Plc (NB), which competes with this product of International Breweries Plc. Other products that compete with this product are Malta Guinness of Guinness Plc, Hi-Malt of Consolidated Breweries, and so on.
3. TECHNOLOGICAL KNOW-HOW: Due to the uncontrolled and incessant inflation in the nation, the company can only afford fewer experts in manufacturing. This also leads to the importation of machines, which slows the development of the company.
4. FINANCIAL PROBLEMS: These affect the growth of the company because of the creeping inflation in the country. Inflation has made the government stop the financial assistance grants rendered to companies. Both short-term and long-term loans have dropped drastically.
1.2 AIMS AND OBJECTIVES
The main aim of this project work is to fulfil part of the requirements for the award of the National Diploma in Statistics. It is also to test the student’s ability to apply the theoretical knowledge acquired in the study of statistics, with particular application of regression and correlation to real (practical) life. We are to consider the use of correlation and regression analysis on a certain product. The project objectives include the following:
· To determine if there exists any relationship between sales and advertising expenditure.
· To determine the dependency between advertisement expenditure and sales.
· To measure the degree of association between the two variables.
· To test for the significance of the regression and correlation coefficients.
· To verify the type of relationship between the two variables.
· To use the necessary statistical concepts in order to explain the variation between the two variables, or how advertising expenditure affects sales.
· To forecast the future behaviour of sales and advertising expenditure.
· To improve the activities of International Breweries through reasonable recommendations based on the results of the statistical analysis.
1.3 RESEARCH HYPOTHESIS
The following research hypotheses have been formulated:
H0: There is no relationship between sales and advertising expenditure of Maltina.
H1: There is a relationship between sales and advertising expenditure of Maltina.
1.4 SIGNIFICANCE OF THE STUDY
The study is very important in the sense that it enables us to understand the rate at which advertising expenditure affects sales.
The data (information) used in this research work is based on the yearly sales and advertising expenditure of a product. Although a study of this nature might have been carried out before, its significance for academic knowledge (both economic and statistical) cannot be overemphasized.
1.5 SCOPE AND LIMITATION
This project, on the topic of regression and correlation analysis, analyses the data on the sales and advertising expenditure of Maltina non-alcoholic drink, one of the products of International Breweries Plc, Ilesa, from the year 2005 to 2014. The ten-year set of data used in this study is a sample taken from the sales and advertising expenditure of Maltina non-alcoholic drink; the regression and correlation coefficients obtained are therefore sample estimates of the population parameters for IBPLC Ilesa. The analysis is done on these ten years of data, inferences are drawn, and useful recommendations are made for the company.
In the course of collecting statistical data, there are bound to be problems associated with it. The problems may be technical or attitudinal. In this project, some of the problems faced are:
(i) Inability to go to every company that produces malt drinks to collect data. That is why we concentrate on International Breweries Plc, Ilesa.
(ii) Time constraint: the time available for collecting the data was very short.
(iii) Inability to meet respondents on time. The company was visited on several occasions before data could be obtained.
(iv) Economic problems, such as the increase in transport fares, also constituted a great problem.
(v) The distance of the company was also a constraint.
1.6 DEFINITION OF SOME TERMS
(1) REGRESSION: This is a statistical device with the objective of analyzing the association between two or more variables.
(2) CORRELATION: This is the measure of the degree of association between two or more quantitative variables.
(3) DATA: Data is a set of numerical information collected for a particular purpose by an investigator.
i. PRIMARY DATA: This refers to the statistical data (information) which the investigator originates himself for the purpose of the enquiry in hand.
ii. SECONDARY DATA: This refers to statistical data which are not originated by the investigator himself, but obtained from some organizations, either in published or unpublished form.
(4) BIVARIATE: This is a set of values which appear in pairs, whereby one value, Y, depends on the other value, X. Hence, sales is the dependent variable Y, while advertising expenditure incurred by the company is the independent variable X.
(5) D.F - Degree of freedom
SST - Sum of squares total
SSE - Sum of squares error
SSA - Sum of squares among the groups
SS - Sum of squares
MS - Mean square
Fcal - F-test calculated value
a, b - Parameters (regression coefficients)
SXY - Covariance of X and Y
SYY - Variance of Y
SXX - Variance of X
n - Number of observations
(6) VARIABLE: This is a factor that changes quantitatively or qualitatively.
(7) DEPENDENT VARIABLE: This is the variable that represents the observed outcome of an activity or a venture. It is the outcome of an experiment or of a production process. It is the response to another variable that feeds it.
(8) INDEPENDENT VARIABLE: This is a variable that can be controlled or manipulated at will by the investigator. It can be called an input.
(9) X - A boldface lower case letter, or the upper case form, represents a random variable.
Xi - represents the ith value of the variable X
Σ - represents summation
ΣXi - represents the sum of the values of the variable X
ΣXY - represents the sum of the products of X and Y
ΣX² - represents the sum of the squares of X
(10) IBPLC - This means International Breweries Public Limited Company.
(11) CONFIDENCE INTERVAL: This can simply be referred to as the region of acceptance.
(12) LINE OF BEST FIT: This is the line that best represents the data on the graph.
CHAPTER TWO
2.0 LITERATURE REVIEW
2.1 REGRESSION AND CORRELATION ANALYSIS
In this chapter, we shall discuss the various methods that can be used to determine whether any relationship exists between two variables, and how it can be expressed numerically.
REGRESSION: This is a measure of the relationship between two or more variables (X, Y) where the value of one (Y) depends on the other (X). Regression analysis can also be referred to as the procedure by which an algebraic equation is formulated to estimate the value of a continuous random variable given the value of another quantitative variable. The variable for which the value is estimated by the regression equation is called the dependent variable; the variable used as the basis for the estimate is called the independent variable (Leonard J. Kazmier, 1979). When there is only one independent variable in the regression equation, we have a simple regression. When the regression is a linear one, then we have a simple linear regression. When we have two or more independent variables, the relationship is a multiple one.
ASSUMPTIONS FOR THE USE OF REGRESSION ANALYSIS
Regression analysis is based on some assumptions, among which are:
(i) The relationship between the two variables must be a linear one.
(ii) The independent variables are fixed while the dependent variables are random.
(iii) The conditional distributions of the variable must have equal variances.
2.2 SCATTER DIAGRAM
This is the diagram used to express the relationship between variables. Consider the case of a simple linear regression: a set of pairs of values (X, Y) is obtained, and the points represented by (X, Y) are plotted, Y against X. The graph (diagram) obtained is known as a SCATTER DIAGRAM. From this, one may judge whether the variables are linearly associated or not. If the graph shows a linear association, then the following regression model can be fitted to the data:
$$Y_i = \alpha + \beta X_i + e_i \qquad (1)$$
This is the general model that is usually fitted. $\alpha$ and $\beta$ are the population parameters: $\alpha$ is called the intercept (the value of Y when X = 0), $\beta$ is called the slope or the regression coefficient (the rate at which the variable Y changes for every unit change in the variable X), and $e_i$ is the random error component.
The sample estimate of Y is given as
$$Y_{est} = a + bX_i \qquad (2)$$
Then
$$Y_i = a + bX_i + e_i \qquad (3)$$
If data are available, ‘a’ and ‘b’ can be obtained and so equation (2) can be estimated. The second part, $e_i$, can be evaluated as $(Y_i - Y_{est})$.
Examples of measurements that show a linear relationship:
(i) Price of goods and quantity supplied or demanded.
(ii) Income and expenditure.
(iii) Experience (years) and efficiency of a cashier.
2.2.1 SHAPES OF SCATTER DIAGRAM
When the values of Y are plotted against the values of X, any of the following diagrams may be obtained.
[Figure: scatter diagram shapes, each with Y on the vertical axis and X on the horizontal axis: (i) positive linear relationship; (ii) negative linear relationship; (iii) no correlation; (iv) positive curvilinear relationship; (v) negative curvilinear relationship.]
2.3 REGRESSION EQUATION
This is the equation of the regression line for a linear regression. A simple linear regression equation is of the form Y = a + bX, where a is the Y intercept (the value of Y at the point where X = 0) and b is the slope of the line (the change in Y which accompanies a change of one unit in X). Ordinarily, the numerical constants ‘a’ and ‘b’ are estimated from sample data, and once they have been determined, we can substitute a given value of X into the equation and calculate the predicted value of Y. The general equation of a straight line is given by:
Y = a + bX, where:
b = Slope (gradient) of the line
a = Intercept
X = Independent variable
Y = Dependent variable
We are to consider the following deductions from the general equation of a straight line:
i. The line Y = a + bX cuts the Y-axis at ‘a’.
ii. The line Y = a + bX passes through the origin when a = 0.
iii. The line Y = a + bX slopes upward from left to right when b > 0.
iv. The line Y = a + bX slopes downward from left to right when b < 0.
2.4 METHOD OF FITTING REGRESSION LINE
Observing the bivariate data helps a lot in deciding whether or not two variables X and Y are correlated. Bivariate data (X, Y) is a set of values which appear in pairs, in which the value of one, Y, depends on the other, X. The best way to show whether or not any relationship exists between two variables X and Y is by drawing a graph of the bivariate data. The relationship is shown to be linear when all the points lie on or near a line; this line is called a “regression” line. The regression line can be fitted by:
(i) Free hand method
(ii) Semi-average method
(iii) The least square method
(iv) The mean method.
2.4.1 FREE HAND METHOD
The regression line is fitted into the scatter diagram by eye. This method may not produce a unique regression line and regression coefficient, as the line drawn depends on individual judgment. It nevertheless shows visibly how a regression line could be fitted into a scatter diagram.
The major deficiency of this method is that it does not produce a unique answer: different individuals obtain different “lines of best fit”, each according to their own judgment.
2.4.2 THE MEAN METHOD
The grand means $\bar{X}$ and $\bar{Y}$ of the two variables X and Y are computed and the point $(\bar{X}, \bar{Y})$ is plotted in the scatter diagram. A line is then drawn to pass through the point $(\bar{X}, \bar{Y})$ in such a way that the numbers of points in the scatter diagram above and below the line are equal or almost equal. The line so drawn is the regression line. The main disadvantage of this method is that it is subjective and may not lead to a unique regression line and regression coefficient.
2.4.3 SEMI AVERAGE METHOD
The technique consists of separating the data into two equal parts or groups, plotting the mean point for each group, and joining these two points with a straight line. The procedure is as follows:
STEP 1: Arrange the bivariate data in order of X-value.
STEP 2: Split the data into two equal groups (parts), a lower half and an upper half; in case there is an odd number of items, the middle item is usually left out.
STEP 3: Compute the mean point of each group, plot the two mean points, and join them with a straight line.
To get the intercept on Y, extend the straight line to cross the Y-axis and read the Y value. This is ‘a’. The ‘b’ is obtained by calculating the ratio of the difference between the two means of Y to the difference between the two means of X.
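The semi-average procedure can be sketched in Python as follows. This is only an illustration: the function name `semi_average_fit` and the figures are hypothetical (they are not the Maltina data), and dropping the middle pair when the number of items is odd is an assumed convention.

```python
from statistics import mean

def semi_average_fit(xs, ys):
    """Estimate a and b of Y = a + bX by the semi-average method."""
    pairs = sorted(zip(xs, ys))      # order the pairs by X value
    half = len(pairs) // 2
    lower = pairs[:half]             # lower half
    upper = pairs[-half:]            # upper half (middle pair dropped if n is odd)
    x1, y1 = mean(p[0] for p in lower), mean(p[1] for p in lower)
    x2, y2 = mean(p[0] for p in upper), mean(p[1] for p in upper)
    b = (y2 - y1) / (x2 - x1)        # ratio of the differences of the two means
    a = y1 - b * x1                  # where the joining line crosses the Y-axis
    return a, b

# Hypothetical figures for illustration only
adverts = [1, 2, 3, 4, 5, 6]
sales = [3, 5, 7, 9, 11, 13]
a, b = semi_average_fit(adverts, sales)
print(f"Y = {a:.2f} + {b:.2f}X")
```

Because the line is forced through only two mean points, the result will generally differ from the least square line unless the points are exactly linear.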
2.4.4 THE LEAST SQUARE METHOD
This is the best method to estimate ‘a’ and ‘b’ of the regression line with equation Y = a + bXi. If ‘Y’ changes in some systematic way with a change in ‘X’, we should be able to predict ‘Y’ from ‘X’.
Regression can therefore also be defined as the way by which predictions are made and how their accuracy is determined, by comparing the actual values with the estimated line. The line can be denoted by Y = a + bXi + ei, where ei is the residual or error term. In general, the least square estimates are derived from the above equation as follows.
$$Y_i = a + bX_i + e_i$$
$$e_i = Y_i - a - bX_i$$
The sum of squares of error (SSE) is
$$SSE = \sum e_i^2 = \sum (Y_i - a - bX_i)^2 \qquad (i)$$
Hence, we minimize this by differentiating equation (i) with respect to a and b:
$$\frac{\partial SSE}{\partial a} = -2\sum (Y_i - a - bX_i) \qquad (ii)$$
$$\frac{\partial SSE}{\partial b} = -2\sum X_i (Y_i - a - bX_i) \qquad (iii)$$
Equate equations (ii) and (iii) to zero:
$$-2\sum (Y_i - a - bX_i) = 0 \qquad (iv)$$
$$-2\sum X_i (Y_i - a - bX_i) = 0 \qquad (v)$$
Divide equations (iv) and (v) by -2:
$$\sum (Y_i - a - bX_i) = 0 \qquad (vi)$$
$$\sum X_i (Y_i - a - bX_i) = 0 \qquad (vii)$$
Solving equation (vi):
$$\sum Y_i - na - b\sum X_i = 0$$
$$na = \sum Y_i - b\sum X_i$$
Divide through by n:
$$a = \frac{\sum Y_i}{n} - b\frac{\sum X_i}{n} = \bar{Y} - b\bar{X}$$
Expanding equation (vii):
$$\sum X_i Y_i - a\sum X_i - b\sum X_i^2 = 0$$
Substituting for a in this equation, we have:
$$\sum X_i Y_i - \left(\frac{\sum Y_i}{n} - b\frac{\sum X_i}{n}\right)\sum X_i - b\sum X_i^2 = 0$$
$$\sum X_i Y_i - \frac{\sum X_i \sum Y_i}{n} + \frac{b(\sum X_i)^2}{n} - b\sum X_i^2 = 0$$
Multiply through by n:
$$n\sum X_i Y_i - \sum X_i \sum Y_i + b\left(\sum X_i\right)^2 - nb\sum X_i^2 = 0$$
$$n\sum X_i Y_i - \sum X_i \sum Y_i = b\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]$$
$$b = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}$$
After obtaining ‘a’ and ‘b’, their values can be substituted into the original equation to get the equation of the regression line, i.e. Y = a + bX.
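As a check on the least square formulas for ‘a’ and ‘b’, the computation can be sketched in Python. The function name `least_squares_fit` and the five pairs of figures are hypothetical and serve only to illustrate the arithmetic; they are not the company's data.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the least square line Y = a + bX."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # b = [n*SumXY - SumX*SumY] / [n*SumX^2 - (SumX)^2]
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # a = mean of Y minus b times mean of X
    a = sum_y / n - b * sum_x / n
    return a, b

# Hypothetical figures for illustration only
adverts = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
a, b = least_squares_fit(adverts, sales)
print(f"Y = {a:.1f} + {b:.1f}X")   # Y = 2.2 + 0.6X
```

Here ΣX = 15, ΣY = 20, ΣXY = 66 and ΣX² = 55, so b = (330 - 300)/(275 - 225) = 0.6 and a = 4 - 0.6(3) = 2.2.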
2.5 CHI - SQUARE TEST
The relationship between two variables can be tested through the use of the chi-square test ($\chi^2$). Suppose that in a particular sample a set of possible events E1, E2, …, Ec are known to occur O1, O2, …, Oc times respectively, and that according to probability rules they are expected to occur e1, e2, …, ec times respectively. Then O1, O2, …, Oc are called the observed frequencies and e1, e2, …, ec are called the expected or theoretical frequencies.
Our aim is to assess whether the observed frequencies differ significantly or not from the expected ones. A measure of the discrepancy between the observed and expected frequencies is given by the test statistic called “CHI-SQUARE”, which is given by:
$$\chi^2_{cal} = \sum_{i=1}^{c} \frac{(O_i - e_i)^2}{e_i}$$
Note: When observations can be categorized in two ways, with r levels of the first categorization and c levels of the second categorization, then for any set of n observations we have an r x c contingency table, which may be arranged as in the table below.
CATEGORIZATION 1 (rows) against CATEGORIZATION 2 (columns)
      | 1   | 2   | 3   | … | c   | TOTAL
1     | O11 | O12 | O13 | … | O1c | O1.
2     | O21 | O22 | O23 | … | O2c | O2.
3     | O31 | O32 | O33 | … | O3c | O3.
.     | .   | .   | .   | … | .   | .
r     | Or1 | Or2 | Or3 | … | Orc | Or.
TOTAL | O.1 | O.2 | O.3 | … | O.c | n
The hypotheses to be tested are as follows:
H0: There is no association between sales and advertising expenditure
H1: There is an association between sales and advertising expenditure
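A minimal Python sketch of the chi-square computation for an r x c contingency table is given here. It assumes the usual rule, not spelled out in the text, that the expected frequency of a cell is (row total x column total) / n and that the degrees of freedom are (r - 1)(c - 1); the function name `chi_square_statistic` and the 2 x 2 table are hypothetical.

```python
def chi_square_statistic(table):
    """Return (chi-square, degrees of freedom) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Assumed rule: expected = row total * column total / n
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Hypothetical 2 x 2 table of observed frequencies
observed = [[20, 10],
            [10, 20]]
chi2, dof = chi_square_statistic(observed)
print(round(chi2, 3), dof)   # 6.667 1
```

The calculated value is then compared with the tabulated chi-square value at the chosen significance level and the stated degrees of freedom.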
2.6 TEST OF HYPOTHESIS
2.6.1 THE F - TEST
Consider two independent samples taken from two populations assumed to be normally distributed. Let n1, n2 be the sample sizes drawn from the populations and $S_1^2$, $S_2^2$ be the sample estimates of the population variances $\sigma_1^2$, $\sigma_2^2$ respectively. Then the F-statistic needed for the test is
$$F = \frac{S_1^2}{S_2^2} \qquad (i)$$
The critical point of F at the specified significance level $\alpha$ is $F_{\alpha;\, n_1-1,\, n_2-1}$, where $n_1-1$ and $n_2-1$ are the degrees of freedom.
Therefore, to test whether two populations have equal variances, $\sigma_1^2 = \sigma_2^2$ (i.e. that the two distributions have the same shape), the corresponding hypotheses are set as follows:
$H_0: \sigma_1^2 = \sigma_2^2 = \sigma^2$ (the distributions have the same shape)
$H_1: \sigma_1^2 \neq \sigma_2^2$
The critical value of F is $F_{\alpha/2;\, n_1-1,\, n_2-1}$ (two-tailed test).
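The variance ratio can be sketched in Python with the standard library, whose `statistics.variance` returns the sample variance (divisor n - 1). The function name `variance_ratio` and the two samples are hypothetical and chosen only to keep the arithmetic simple.

```python
from statistics import variance

def variance_ratio(sample1, sample2):
    """Return F = S1^2 / S2^2 with its two degrees of freedom."""
    s1_sq = variance(sample1)   # sample variance, divisor n1 - 1
    s2_sq = variance(sample2)   # sample variance, divisor n2 - 1
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1

# Hypothetical samples for illustration only
f_cal, df1, df2 = variance_ratio([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(f_cal, df1, df2)   # 4.0 4 4
```

The calculated F is compared with the tabulated critical value for (df1, df2) degrees of freedom; the table lookup is left out of the sketch.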
2.6.2 INTERVAL ESTIMATE AND SIGNIFICANCE TEST
When a population parameter is estimated by a single number, the resulting estimate is known as a point estimate. On the other hand, it is an interval estimate if the estimate is stated as lying between two numbers. Thus, a single value such as $\mu = 2$ is a point estimate of $\mu$, while a statement that $\mu$ lies between two limits is an interval estimate of $\mu$. The level of significance is the area or percentage of the standard distribution curve covered by the rejection region. For a two-tailed test at the 95% confidence level, each rejection region covers 2.5% (or 0.025) of the standard normal distribution curve, while for a one-tailed test the rejection region covers 5% (0.05).
2.6.3 POSSIBLE TEST
TEST ON SINGLE MEAN: Let $\bar{X}$ be the mean of a sample of size n obtained from the population of units of interest. Let $\mu$ be the population mean of the variable X (or the hypothesized mean of X). When $\sigma$ is unknown and the population is normal but n < 30, we use the t-distribution:
$$t_{cal} = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{(n-1)} \qquad (1)$$
The critical value of t is $t_{\alpha,\, n-1}$ (one-tailed test).
When n is large, i.e. n > 30, we use the Z-distribution for the calculated statistic:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \qquad (2)$$
The critical value of Z is $Z_{\alpha}$ (one-tailed test). For a test that is based on a proportion, we can use
$$Z = \frac{\hat{p} - P}{\sqrt{P(1-P)/n}} \qquad (3)$$
where $\hat{p}$ is the sample proportion and P is the hypothesized population proportion.
TEST FOR DIFFERENCE OF MEANS: In this test, let X1, X2 be the variables under consideration from population 1 and population 2 respectively. Let $\bar{X}_1$, $\bar{X}_2$ be the sample means, $S_1^2$, $S_2^2$ the sample variances, and $\sigma_1$, $\sigma_2$ the population standard deviations. The test statistic can be calculated when n is large or small.
(a) For large samples, when $\sigma_1$ and $\sigma_2$ are known, use the Z-statistic
$$Z_{cal} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$
The critical value of Z is $Z_{1-\alpha/2}$.
(b) For small samples, when $\sigma_1$ and $\sigma_2$ are unknown, use the t-statistic
$$t_{cal} = \frac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{1/n_1 + 1/n_2}}$$
where
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
The critical value of t is $t_{(1-\alpha/2),\, v}$, where $v = n_1 + n_2 - 2$.
Note: The critical value depends on the type of test to be carried out, whether a two-tailed or a one-tailed test.
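For the small-sample case (b), a short Python sketch of the pooled-variance t-statistic follows. It assumes the standard form in which the difference of the sample means is divided by Sp·√(1/n1 + 1/n2); the function name `pooled_t` and the two samples are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Return (t, degrees of freedom) for the pooled two-sample t-statistic."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: weighted average of the two sample variances
    sp_sq = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2)) / sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical samples for illustration only
t_cal, dof = pooled_t([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(round(t_cal, 4), dof)   # 1.8974 8
```

With these figures the pooled variance is 6.25, so t = 3/√2.5 with 8 degrees of freedom.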
2.7 CORRELATION ANALYSIS
If two quantities vary in such a manner that movement in one is associated with movement in the other, the quantities are said to be correlated.
Correlation analysis is a technique for estimating the closeness or degree of relationship between two or more variables. Thus, in correlation analysis we try to determine how well a linear equation or other equation describes or explains the relationship between variables. When only two variables (X and Y) are involved, the correlation is said to be SIMPLE. When more than two variables are involved, we speak of MULTIPLE CORRELATION. Correlation helps us to know more about the relationship between two or more variables. It does not tell us the cause of this relationship; it only shows whether the relationship exists or not. In the course of this study we shall limit our discussion to simple correlation, that is, the measure of the degree of association between two variables only.
2.7.1 PATTERNS OF CORRELATION
[Figure: five scatter diagrams of Y against X: (a) perfect positive correlation (r = 1); (b) perfect negative correlation (r = -1); (c) positive correlation (0 < r < 1); (d) negative correlation (-1 < r < 0); (e) no correlation (r = 0).]
Note: If all the points of the scatter diagram lie on a straight line, then we say that the two variables are perfectly correlated (a and b). Perfect correlation can be positive or negative, depending on whether Y increases as X increases (a) or Y decreases as X increases (b).
If Y tends to increase as X increases, as in figure (c), then the correlation is said to be positive or direct. On the other hand, if Y tends to decrease as X increases, as in figure (d), then the correlation is called negative or inverse correlation. If there is no definite pattern in the direction of the variables X and Y, then we say the variables are uncorrelated or have zero correlation (figure (e)).
2.7.2 INTERPRETATION OF RESULTS
(i) r = +1 means the variables have a perfect positive correlation
(ii) r = -1 means the variables have a perfect negative correlation
(iii) r = 0 means there is no correlation between the variables
(iv) -1 < r < 0 means there is an imperfect negative correlation between the variables
(v) 0 < r < 1 means there is an imperfect positive correlation between the variables
It should be noted that the greater the magnitude of r, the stronger the association. For example, if r1 = 0.65 and r2 = 0.8, then the variables in the second case are more strongly positively correlated than those in the first.
2.7.3 ASSUMPTIONS FOR THE USE OF CORRELATION ANALYSIS
Correlation analysis is based on some assumptions, among which are:
(i) The relationship between the two variables must be a linear one.
(ii) Both variables are random variables.
(iii) The conditional distribution for each variable must have equal variances.
(iv) Successive observations for each variable are uncorrelated.
2.8 METHOD OF FINDING CORRELATION
2.8.1 SCATTER DIAGRAM
A scatter diagram can be used to assess the existence of correlation between two variables. It is the graphical representation of bivariate data, which helps to reveal the correlation (association) between the variables. The variables may be positively or negatively correlated. The dependent variable ‘Y’ is plotted along the vertical axis and the independent variable ‘X’ is plotted along the horizontal axis.
TYPES OF SCATTER DIAGRAM
Correlation can be graphically presented as:
i. Direct (positive) linear scatter diagram: This is an indication that as X increases, the value of Y also increases.
[Figure: scatter diagram of Y against X with points rising from left to right (b > 0).]
ii. Inverse (negative) linear scatter diagram: This shows that as X increases, the value of Y tends to decrease.
[Figure: scatter diagram of Y against X with points falling from left to right (b < 0).]
iii. Zero linear scatter diagram: This is observed when the points of the scatter diagram do not follow any specific pattern, i.e. X does not provide any useful information about Y, or we can say there is no relationship between the two variables.
[Figure: scatter diagram of Y against X with no pattern (b = 0).]
2.8.2 CORRELATION COEFFICIENT
The coefficient of correlation, r, may be defined as the square root of the product of the two regression coefficients. That is,
$$r = \pm\sqrt{b_{yx}\, b_{xy}}$$
where $b_{yx}$ = regression coefficient of Y on X and $b_{xy}$ = regression coefficient of X on Y.
Also,
$$r = \frac{\text{covariance of X and Y}}{(\text{standard deviation of X}) \times (\text{standard deviation of Y})} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
where $\sigma_{xy}$ = covariance of X and Y,
$\sigma_x$ = standard deviation of X and $\sigma_x^2$ = variance of X,
$\sigma_y$ = standard deviation of Y and $\sigma_y^2$ = variance of Y.
From the definition of the correlation coefficient, it can be shown that r can take any value between -1 and +1.
r = +1 when there is a perfect relationship between X and Y, with a unit increase in X always leading to a constant increase in Y.
r = -1 when a unit increase in X always leads to a constant decrease in Y.
r = 0 when there is no linear relationship between X and Y.
The major methods used for calculating the correlation coefficient are:
(i) Karl Pearson’s product moment coefficient of correlation (r)
(ii) Spearman’s rank correlation coefficient (R)
2.8.3 KARL PEARSON’S PRODUCT MOMENT CORRELATION COEFFICIENT
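The Karl Pearson’s product moment correlation coefficient (or coefficient of correlation) is denoted by r and given by:
$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
where $-1 \le r \le 1$.
Limitation: It cannot be used when direct quantitative measurement of the phenomenon under study is not possible.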
COEFFICIENT OF DETERMINATION
When one variable is used to predict the other, it is usually very important to assess its usefulness before it is used to predict future values with some measure of satisfaction. The coefficient of determination is the proportion of the total variation in the dependent variable that can be accounted for by the independent variable in the equation. This tells us the percentage of the total variation that the independent variable can explain. The coefficient of determination is defined according to Leonard (1979) as:
$$r^2 = \frac{SSR}{TSS}$$
OR
$$r^2 = 1 - \frac{SSE}{TSS}$$
where SSR, SSE and TSS are the regression, error and total sums of squares.
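The product moment coefficient and the coefficient of determination can be sketched in Python. The sketch assumes the usual raw-sum form of Pearson's formula, r = [nΣXY - ΣXΣY] / √{[nΣX² - (ΣX)²][nΣY² - (ΣY)²]}; the function name `pearson_r` and the figures are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's product moment correlation coefficient from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical figures for illustration only
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_squared = r ** 2   # coefficient of determination
print(round(r, 4), round(r_squared, 2))   # 0.7746 0.6
```

In this illustration r² = 0.6, meaning that 60% of the variation in Y is accounted for by X.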
2.8.4 SPEARMAN’S RANK CORRELATION COEFFICIENT
Spearman’s rank correlation is the non-parametric equivalent of the parametric correlation coefficient described above. If data are in the form of ranks, or they can conveniently be ranked, then we can apply this measure of correlation. If we have to rank the measures ourselves, there is the possibility that some values will be repeated. When values are repeated, we have to assign their average rank to each of them. The coefficient is defined as:
$$R_{xy} = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
where d = Rank(X) - Rank(Y) and n = number of objects being ranked.
LIMITATIONS
(i) The main limitation of this is that it is not as accurate as Karl Pearson’s coefficient of correlation.
(ii) It is not appropriate where n is more than thirty (30), unless the original data are ranks instead of scores.
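A small Python sketch of the rank correlation follows, assuming the usual formula R = 1 - 6Σd² / [n(n² - 1)] and giving repeated values their average rank as described. The helper names `average_ranks` and `spearman_rank` and the scores are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); repeated values share their average rank."""
    ordered = sorted(values)
    # First 1-based position is index + 1, last is index + count; take their mean
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]

def spearman_rank(xs, ys):
    """Spearman's rank correlation coefficient."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical scores for illustration only
R = spearman_rank([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
print(round(R, 2))   # 0.8
```

Here the rank differences are 0, -1, 1, -1, 1, so Σd² = 4 and R = 1 - 24/120 = 0.8.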
2.9 TEST OF HYPOTHESIS (ii)
A statistical hypothesis is a statement about the distribution of one or more random variables, intended to correspond to some statement about the real world.
Hence, a statistical hypothesis is a statement about a population, which we want to verify on the basis of information available from a sample.
TYPES OF HYPOTHESIS
a. THE NULL HYPOTHESIS (H0): This is the hypothesis we set up for the purpose of possible rejection. This hypothesis is always stated first. It is denoted by H0.
b. THE ALTERNATIVE HYPOTHESIS (H1): This is the hypothesis we accept when the null hypothesis is rejected. It indicates the direction of the expected results. It is denoted by H1.
Example of a test of hypothesis:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
TYPE I ERROR: This is the error of rejecting the null hypothesis when in fact it is true. Its probability is called the producer’s risk ($\alpha$).
TYPE II ERROR: This is the error of accepting the null hypothesis when in fact it is false. Its probability is called the consumer’s risk ($\beta$).
CRITICAL REGION: The sampling distribution is divided into two parts (regions): the acceptance region and the rejection region. The acceptance region is termed the confidence region, while the rejection region is termed the critical region. When a parameter is estimated by a single value, the estimate is accepted if it falls within the confidence region (confidence range), with an assurance of $100(1-\alpha)\%$. If it falls outside this range, the estimated value is rejected, implying that it is not worthy to represent the predicted or specified value.
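[Figure: rejection regions of the distribution curve in three panels (a), (b) and (c), each marking the acceptance region with area 1 - α and the rejection region with area α.]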
When n is large (n > 30), the Z-distribution is used, and when n is small (n < 30), we make use of the t-distribution, where n is the number of observations. The test statistic for the correlation coefficient is
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
with n - 2 degrees of freedom.
$H_0: \rho = 0$ (there is no relationship between sales and advertisement)
$H_1: \rho \neq 0$ (there is a relationship between sales and advertisement)
If t-calculated > t-tabulated, reject the null hypothesis (H0).
If t-calculated < t-tabulated, accept the null hypothesis (H0).
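The decision rule can be illustrated with a short Python sketch. It assumes the standard statistic t = r√(n - 2) / √(1 - r²) with n - 2 degrees of freedom; the values r = 0.8 and n = 10 are hypothetical, and 2.306 is the tabulated two-tailed 5% value of t for 8 degrees of freedom.

```python
from math import sqrt

def t_for_correlation(r, n):
    """t-statistic for testing whether a correlation coefficient differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Hypothetical values: r = 0.8 from n = 10 pairs of observations
t_cal = t_for_correlation(0.8, 10)
t_tab = 2.306   # tabulated t, 8 degrees of freedom, 5% level, two-tailed
decision = "reject H0" if abs(t_cal) > t_tab else "accept H0"
print(round(t_cal, 3), decision)   # 3.771 reject H0
```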
2.9.1 ANALYSIS OF VARIANCE (ANOVA)
The analysis of variance is a statistical technique used to test the null hypothesis for multiple means, $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$, using the F-ratio, which is an extension of the t-test. The F-ratio is defined as
$$F = \frac{\text{variance estimate between groups}}{\text{variance estimate within groups}} \qquad \text{(i)}$$
The difference between this test and the t-test is that while the F-ratio uses the ratio of two variance estimates to test whether the multiple means are significantly different, the t-test uses the ratio of the difference between two means to the within-group variation to test whether the two means are significantly different.
ANOVA can be classified according to the number of independent variables involved in the study. When only one independent variable is involved in the study of the dependent variable, the ANOVA is termed one-way. If two independent variables are used to study the dependent variable, we have two-way ANOVA.
For regression, the analysis of variance partitions the total sum of squares as:
$$S_{YY} = bS_{XY} + SSE$$
$$TSS = SSR + SSE$$
Where SSE = sum of squares of errors
TSS = total sum of squares
SSR = sum of squares of regression
Source of variation | Degree of freedom | Sum of squares | Mean sum of squares | F-ratio
Regression | 1 | SSR | MSR = SSR/1 | MSR/MSE
Error | n-2 | SSE | MSE = SSE/(n-2) |
Total | n-1 | TSS | |
From the above table, we can see that SSR and SSE are values of independent chi-square variables with 1 and n-2 degrees of freedom respectively. Also, TSS is a value of a chi-square variable with n-1 degrees of freedom.
Where;
MSR = mean sum of squares of regression
MSE = mean sum of squares of error
SIGNIFICANCE TEST FOR "r"
If H0 is true, we would expect r to be distributed about zero with standard error;
$$S_r = \sqrt{\frac{1-r^2}{n-2}}$$
2.10 LIMITATIONS OF REGRESSION AND CORRELATION ANALYSIS
(i) All assumptions for their use must be satisfied.
(ii) Extrapolation may be dangerous because there is no statistical basis to assume that the linear relationship will apply outside the range of the sample data.
(iii) A significant correlation does not necessarily indicate causation but indicates just a common linkage in a sequence of events.
(iv) The correlation coefficient may be misleading if it is a spurious one.
CHAPTER THREE
3.0 RESEARCH METHODOLOGY AND ANALYSIS OF DATA
3.1 INTRODUCTION
This chapter presents the design of the study, the area of the study, sampling procedures, the sample, instrumentation, validity of the instrument, reliability of the instrument, procedure for data collection and method of data analysis.
3.2 THE DESIGN OF THE STUDY
This work is designed to carry out a correlation and regression analysis of the sales and advertising expenditure of Maltina of International Breweries Plc, Ilesa in Osun State, applying some statistical concepts to analyse the data collected for a period of ten years, in order to determine whether there exists any relationship between the sales and advertising expenditure of Maltina, and to advise the company accordingly.
3.3 AREA OF THE STUDY
This research was carried out in Osun State, particularly at International Breweries Plc, Ilesa. The company was incorporated in Nigeria as a private limited liability company on 22nd December 1971 but commenced operation in December 1978. The objective was to establish a brewery to produce and market high quality beer with the brand name "Trophy". The company operates from its brewery. The company also brews and sells the "Major" brand of lager beer, a beer brewed with 100% local inputs. Besides lager beer, the company equally produces and bottles "Maltina", a non-alcoholic beverage drink, to increase the variety of its products. Over the years, these products have not only been accepted locally, with the NIS award, but also in the international markets. All these products are now made from 100% local cereals, and the company has developed an unflagging resolve to stay in the front league in the production of high quality products in the highly competitive beer and beverage markets, with the overall objective of satisfying the consuming public in Nigeria.
International Breweries PLC, Ilesa has well organised sales and advertising units. These units are under the marketing department of the company. The sales and advertising managers, who are also members of the Sales and Advertising Practitioners Council of Nigeria (SAPCON), report to the head of the marketing department. The creative section of the sales and advertising units of the company is under the leadership of a professional graphic artist. Each of the company's products is handled by a separate advertising agency.
3.4 SOURCE OF DATA
The major source of data used in this study is secondary data, that is, statistical data obtained from organizations in either published or unpublished form.
3.5 METHODS OF DATA COLLECTION
The method used for the collection of the data is transcription from records. This is a useful method when data for a particular purpose are already recorded in a register maintained in one or more departments, making it easier to collect them directly from the unit that maintains the records at International Breweries Plc, Ilesa, in Osun State.
3.6 PROBLEMS OF DATA COLLECTION
In Nigeria, and indeed in any third world country, accurate data may be very hard to get. This may be due to many reasons, ranging from the individual to corporations, agencies and even the government as a body. Some of the reasons are listed below:
i. Lack of proper communication between users and producers of statistical data.
ii. Difficulty in estimating variables which are of interest to planners.
iii. Ignorance and illiteracy of respondents.
iv. High proportion of non-response due to suspicion on the part of respondents.
v. Lack of frames from which samples can be selected.
vi. Wrong ordering of priorities, including misdirection of emphasis and bad utilization of human and material resources.
3.7 PRESENTATION OF DATA
Graphs, charts and tables are used to present information and to support the analysis and interpretation of statistical data. They are an excellent way of condensing information and the fastest means through which we receive complex information. After the collection of data, the first step is to prepare the data by editing, verifying, checking coverage, and coding for computation and summary.
3.8 THE SALES AND ADVERTISING EXPENDITURE OF MALTINA OF INTERNATIONAL BREWERIES PLC, ILESA (2005-2014)
YEAR | ADVERTISING EXPENDITURE | SALES
2005 | 279,990 | 527,567
2006 | 396,513 | 568,095
2007 | 222,280 | 1,033,307
2008 | 970,611 | 4,758,281
2009 | 1,334,972 | 7,413,610
2010 | 6,350,408 | 11,943,641
2011 | 8,507,623 | 13,943,773
2012 | 10,526,471 | 14,430,910
2013 | 13,291,765 | 15,200,669
2014 | 17,224,147 | 21,123,156
Source: International Breweries Plc, Ilesa.
CHAPTER FOUR
4.0 ANALYSIS AND INTERPRETATION OF DATA
Here, we will deal with the mathematical or computational analysis of the data collected, that is, the sales and advertising expenditure of the Maltina drink of International Breweries Plc, Ilesa for the period 2005 to 2014. The advertising expenditure, which is the independent variable, is represented by X, and sales, the dependent variable, is represented by Y.
YEAR | ADVERTISING EXPENDITURE | SALES
2005 | 279,990 | 527,567
2006 | 396,513 | 568,095
2007 | 222,280 | 1,033,307
2008 | 970,611 | 4,758,281
2009 | 1,334,972 | 7,413,610
2010 | 6,350,408 | 11,943,641
2011 | 8,507,623 | 13,943,773
2012 | 10,526,471 | 14,430,910
2013 | 13,291,765 | 15,200,669
2014 | 17,224,147 | 21,123,156
SOURCE: The Sales and Advertising Expenditure Manager's Office, International Breweries Plc, Ilesa.
In order to make the computation easy, the original data were coded by expressing the figures in hundred thousands, rounded to two decimal places, as shown in Table 4.1 below;
YEAR | ADVERTISING EXPENDITURE | SALES
2005 | 2.80 | 5.28
2006 | 3.97 | 5.68
2007 | 2.22 | 10.33
2008 | 9.71 | 47.58
2009 | 13.35 | 74.14
2010 | 63.50 | 119.44
2011 | 85.08 | 139.44
2012 | 105.26 | 144.31
2013 | 132.92 | 152.01
2014 | 172.24 | 211.23
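The coding step can be sketched in Python as follows. This is only an illustration: the function name `code_in_hundred_thousands` is an assumption of this sketch, and the raw figures are the ones listed in the data table for 2005-2014.

```python
# Raw yearly figures as listed in the data table (2005-2014).
advertising = [279_990, 396_513, 222_280, 970_611, 1_334_972,
               6_350_408, 8_507_623, 10_526_471, 13_291_765, 17_224_147]
sales = [527_567, 568_095, 1_033_307, 4_758_281, 7_413_610,
         11_943_641, 13_943_773, 14_430_910, 15_200_669, 21_123_156]


def code_in_hundred_thousands(values, unit=100_000):
    """Divide each raw figure by the unit and round to two decimal places."""
    return [round(value / unit, 2) for value in values]


x = code_in_hundred_thousands(advertising)  # coded advertising expenditure
y = code_in_hundred_thousands(sales)        # coded sales

for year, xi, yi in zip(range(2005, 2015), x, y):
    print(f"{year}  {xi:8.2f}  {yi:8.2f}")
```

Working with the coded values keeps the sums of squares and products small enough to check by hand without changing the correlation between the two variables.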
4.1 ANALYSIS OF THE DATA
The data presented earlier shall be analysed in order to draw conclusions and make reasonable recommendations. In this analysis, the following procedures shall be adopted:
i. Regression analysis using the least squares method
ii. Correlation analysis using Karl Pearson's product moment and Spearman's rank correlation methods
4.1.1 REGRESSION ANALYSIS USING LEAST SQUARE METHOD
YEAR | X (00,000) | Y (00,000) | XY | X² | Y²
2005 | 2.80 | 5.28 | 14.78 | 7.84 | 27.88
2006 | 3.97 | 5.68 | 22.55 | 15.76 | 32.26
2007 | 2.22 | 10.33 | 22.93 | 4.93 | 106.71
2008 | 9.71 | 47.58 | 462.00 | 94.28 | 2,263.86
2009 | 13.35 | 74.14 | 989.77 | 178.22 | 5,496.74
2010 | 63.50 | 119.44 | 7,584.44 | 4,032.25 | 14,265.91
2011 | 85.08 | 139.44 | 11,863.56 | 7,238.61 | 19,443.51
2012 | 105.26 | 144.31 | 15,190.07 | 11,079.67 | 20,825.38
2013 | 132.92 | 152.01 | 20,205.16 | 17,667.73 | 23,107.04
2014 | 172.24 | 211.23 | 36,382.25 | 29,666.62 | 44,618.11
TOTAL | 591.05 | 909.44 | 92,737.51 | 69,985.91 | 130,187.40
Regression model: $Y_i = a + bX_i + e_i$
From the above information, we can use the normal equations to find the values of 'a' and 'b'.
$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2} \quad \text{or} \quad b = \frac{S_{XY}}{S_{XX}}$$
where; n = 10, ΣX = 591.05, ΣXY = 92737.51, ΣY = 909.44, ΣX² = 69985.91, ΣY² = 130187.40
$$b = \frac{10(92737.51) - (591.05)(909.44)}{10(69985.91) - (591.05)^2} = \frac{927375.10 - 537524.51}{699859.10 - 349340.10} = \frac{389850.59}{350519.00}$$
b = 1.1122
b ≈ 1.11 to 2 decimal places
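As a cross-check on the hand computation, the column totals and the slope can be recomputed with a short Python sketch. The variable names are choices made for this illustration. Because the products and squares are summed at full precision here rather than after rounding each row, the totals for XY and X² can differ from the tabulated ones in the second decimal place.

```python
# Coded data in hundred thousands (X = advertising expenditure, Y = sales).
x = [2.80, 3.97, 2.22, 9.71, 13.35, 63.50, 85.08, 105.26, 132.92, 172.24]
y = [5.28, 5.68, 10.33, 47.58, 74.14, 119.44, 139.44, 144.31, 152.01, 211.23]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Slope from the normal equations.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

print(f"sum X = {sum_x:.2f}, sum Y = {sum_y:.2f}")
print(f"sum XY = {sum_xy:.2f}, sum X^2 = {sum_x2:.2f}")
print(f"b = {b:.4f}")
```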
OR
$$S_{XY} = \sum X_iY_i - \frac{\sum X_i \sum Y_i}{n}, \qquad S_{XX} = \sum X_i^2 - \frac{(\sum X_i)^2}{n}, \qquad S_{YY} = \sum Y_i^2 - \frac{(\sum Y_i)^2}{n}$$
Recall that ΣXiYi = 92737.51, ΣXi = 591.05, ΣYi = 909.44, ΣXi² = 69985.91, ΣYi² = 130187.40 and (ΣXi)² = 349340.10
$$S_{XY} = 92737.51 - \frac{591.05 \times 909.44}{10} = 92737.51 - 53752.45 = 38985.06$$
$$S_{XX} = 69985.91 - \frac{349340.10}{10} = 69985.91 - 34934.01 = 35051.90$$
$$S_{YY} = 130187.40 - \frac{(909.44)^2}{10} = 130187.40 - 82708.11 = 47479.29$$
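The three corrected sums can be checked with a small Python helper. This is a sketch under the assumption that the coded data of Table 4.1 are used; the function name `corrected_sums` is hypothetical.

```python
def corrected_sums(x, y):
    """Return (Sxy, Sxx, Syy) using the shortcut formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    sxx = sum(xi * xi for xi in x) - sum_x ** 2 / n
    syy = sum(yi * yi for yi in y) - sum_y ** 2 / n
    return sxy, sxx, syy


# Coded data in hundred thousands (X = advertising expenditure, Y = sales).
x = [2.80, 3.97, 2.22, 9.71, 13.35, 63.50, 85.08, 105.26, 132.92, 172.24]
y = [5.28, 5.68, 10.33, 47.58, 74.14, 119.44, 139.44, 144.31, 152.01, 211.23]

sxy, sxx, syy = corrected_sums(x, y)
print(f"Sxy = {sxy:.2f}, Sxx = {sxx:.2f}, Syy = {syy:.2f}")
```

Small differences in the second decimal place relative to the hand-computed values come from rounding the individual products before summing in the table.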
$$b = \frac{S_{XY}}{S_{XX}} = \frac{38985.06}{35051.90} = 1.11 \text{ to 2 d.p.}$$
$$a = \bar{Y} - b\bar{X} = \frac{\sum Y_i}{n} - b\,\frac{\sum X_i}{n}$$
Recall that ΣYi = 909.44, ΣXi = 591.05, n = 10
$$a = \frac{909.44}{10} - 1.11\left(\frac{591.05}{10}\right) = 90.94 - 1.11(59.11) = 90.94 - 65.61$$
a = 25.33
Therefore, the regression line Y = a + bX is Y = 25.33 + 1.11X.
The positive slope indicates a positive relationship between the sales and advertising expenditure of Maltina.
YEAR | ADVERTISING EXPENDITURE (X) | SALES (Y) | Y ESTIMATED
2005 | 2.80 | 5.28 | 28.44
2006 | 3.97 | 5.68 | 29.74
2007 | 2.22 | 10.33 | 27.79
2008 | 9.71 | 47.58 | 36.11
2009 | 13.35 | 74.14 | 40.15
2010 | 63.50 | 119.44 | 95.82
2011 | 85.08 | 139.44 | 119.77
2012 | 105.26 | 144.31 | 142.17
2013 | 132.92 | 152.01 | 172.87
2014 | 172.24 | 211.23 | 216.52
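The estimated values in the table above follow from substituting each X into the fitted line. A minimal Python sketch, assuming the rounded coefficients a = 25.33 and b = 1.11 from the text (the function name `predict_sales` is hypothetical):

```python
A = 25.33  # intercept, as rounded in the text
B = 1.11   # slope, as rounded in the text


def predict_sales(x_coded):
    """Estimated sales (in hundred thousands) for a coded advertising value."""
    return A + B * x_coded


# Coded advertising expenditure for 2005-2014.
x = [2.80, 3.97, 2.22, 9.71, 13.35, 63.50, 85.08, 105.26, 132.92, 172.24]
fitted = [predict_sales(xi) for xi in x]

for year, xi, y_hat in zip(range(2005, 2015), x, fitted):
    print(f"{year}  X = {xi:7.2f}  Y estimated = {y_hat:7.2f}")
```

If the unrounded slope (about 1.1122) were used instead, the intercept would come out slightly lower, so the fitted values would shift a little in the first decimal place.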
4.1.2 CONFIDENCE INTERVAL AND TEST OF HYPOTHESIS ON β
CONFIDENCE INTERVAL: The t-distribution can be used to construct a 100(1 - α)% confidence interval for the parameter β in the regression line E(Yi) = α + βxi. The statistic is
$$t = \frac{b - \beta_0}{S/\sqrt{S_{xx}}}$$
with n - 2 degrees of freedom, at the 0.05 level of significance.
The confidence interval for β is given by;
$$\Pr\left(-t_{(1-\alpha/2),\,n-2} \le \frac{b - \beta_0}{S/\sqrt{S_{xx}}} \le t_{(1-\alpha/2),\,n-2}\right) = 1 - \alpha$$
$$-t_{(1-\alpha/2),\,n-2}\,\frac{S}{\sqrt{S_{xx}}} \le b - \beta_0 \le t_{(1-\alpha/2),\,n-2}\,\frac{S}{\sqrt{S_{xx}}}$$
Rearranging to isolate β0:
$$b - t_{(1-\alpha/2),\,n-2}\,\frac{S}{\sqrt{S_{xx}}} \le \beta_0 \le b + t_{(1-\alpha/2),\,n-2}\,\frac{S}{\sqrt{S_{xx}}}$$
Where; b = 1.11, Sxx = 35051.90, α = 0.05
Degrees of freedom = n - 2 = 10 - 2 = 8
Therefore:
$$t_{(1-\alpha/2),\,n-2} = t_{(1-0.05/2),\,10-2} = t_{(1-0.025),\,8} = t_{0.975,\,8} = 2.306 \approx 2.31$$
To calculate S:
$$S = \sqrt{\frac{SSE}{n-2}}, \qquad \text{where } SSE = S_{yy} - b^2 S_{xx}$$
Recall that Syy = 47479.29, Sxx = 35051.90 and b = 1.11
$$S = \sqrt{\frac{47479.29 - (1.11)^2(35051.90)}{8}} = \sqrt{\frac{47479.29 - 43187.45}{8}} = \sqrt{\frac{4291.84}{8}} = \sqrt{536.48}$$
S = 23.16
So also;
$$\sqrt{S_{xx}} = \sqrt{35051.90} = 187.22$$
We have;
$$1.11 - \frac{(2.306)(23.16)}{187.22} \le \beta_0 \le 1.11 + \frac{(2.306)(23.16)}{187.22}$$
$$1.11 - (2.306 \times 0.1237) \le \beta_0 \le 1.11 + (2.306 \times 0.1237)$$
$$1.11 - 0.2853 \le \beta_0 \le 1.11 + 0.2853$$
$$0.8247 \le \beta_0 \le 1.3953$$
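The interval can be reproduced with a few lines of Python. This sketch takes the summary values b, Sxx and Syy as rounded in the text and hard-codes the tabulated critical value 2.306 rather than computing it; all variable names are assumptions of the illustration.

```python
import math

# Summary values as rounded in the text.
b = 1.11
sxx = 35051.90
syy = 47479.29
n = 10
t_critical = 2.306  # t(0.975, 8), read from a t-table as in the text

sse = syy - b ** 2 * sxx             # error sum of squares
s = math.sqrt(sse / (n - 2))         # residual standard error
std_error_b = s / math.sqrt(sxx)     # standard error of the slope
margin = t_critical * std_error_b

lower, upper = b - margin, b + margin
print(f"S = {s:.2f}, SE(b) = {std_error_b:.4f}")
print(f"95% CI for the slope: {lower:.4f} to {upper:.4f}")
```

Since the interval does not contain zero, it points to the same decision as a two-sided t-test of a zero slope at the 0.05 level.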
4.1.3 TEST OF HYPOTHESIS ON β
$H_0: \beta_0 = 0$ (there is no significant linear relationship between sales and advertising expenditure).
$H_1: \beta_0 \neq 0$ (there is a significant linear relationship between sales and advertising expenditure).
We use the t-distribution with n - 2 degrees of freedom to establish the critical region and base our decision on the formula
$$t = \frac{b - \beta_0}{S/\sqrt{S_{xx}}}$$
with n - 2 d.f. and 0.05 level of significance.
We know that; b = 1.11, β0 = 0, S = 23.16, √Sxx = 187.22
$$t = \frac{1.11 - 0}{23.16/187.22} = \frac{1.11}{0.1237}$$
tcal = 8.97
To obtain the tabulated t, with α = 0.05 and n = 10:
$$t_{tab} = t_{(1-\alpha/2),\,n-2} = t_{(1-0.05/2),\,10-2} = t_{(1-0.025),\,8} = t_{0.975,\,8}$$
ttab = 2.306 ≈ 2.31
CRITICAL REGION: Reject H0 and accept H1 if tcal is greater than ttab, i.e. (tcal > ttab). If tcal < ttab, H0 is accepted and H1 is rejected.
CONCLUSION: Since tcal is greater than ttab, i.e. (8.97 > 2.306), we reject H0 and accept H1. This indicates that there is a linear relationship between sales and advertisement.
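The same decision rule can be written as a short Python sketch. The function name `slope_t_statistic` is hypothetical, and the critical value is taken from the t-table as in the text rather than computed.

```python
import math


def slope_t_statistic(b, beta_0, s, sxx):
    """t statistic for testing H0: beta = beta_0, with n - 2 degrees of freedom."""
    return (b - beta_0) / (s / math.sqrt(sxx))


t_cal = slope_t_statistic(b=1.11, beta_0=0.0, s=23.16, sxx=35051.90)
t_tab = 2.306  # t(0.975, 8) from the t-table

# Two-sided test: reject H0 when |t_cal| exceeds the tabulated value.
reject_h0 = abs(t_cal) > t_tab
print(f"t_cal = {t_cal:.2f}, t_tab = {t_tab}, reject H0: {reject_h0}")
```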
4.1.4 CONFIDENCE INTERVAL AND TEST OF HYPOTHESIS FOR α
CONFIDENCE INTERVAL: The t-statistic can also be used to construct a 100(1 - α)% confidence interval for the parameter α. The statistic is
$$t = \frac{a - \alpha_0}{S\sqrt{\sum X^2 / (n S_{xx})}}$$
with n - 2 degrees of freedom, at the 0.05 level of significance.
The confidence interval for α is given by;
$$\Pr\left(-t_{(1-\alpha/2),\,n-2} \le \frac{a - \alpha_0}{S\sqrt{\sum X^2 / (n S_{xx})}} \le t_{(1-\alpha/2),\,n-2}\right) = 1 - \alpha$$
Rearranging to isolate α0:
$$a - t_{(1-\alpha/2),\,n-2}\, S\sqrt{\frac{\sum X^2}{n S_{xx}}} \le \alpha_0 \le a + t_{(1-\alpha/2),\,n-2}\, S\sqrt{\frac{\sum X^2}{n S_{xx}}}$$
where a = 25.33
$$t_{(1-\alpha/2),\,n-2} = t_{(1-0.05/2),\,10-2} = t_{(1-0.025),\,8} = t_{0.975,\,8} = 2.306$$
S = 23.16, n = 10, Sxx = 35051.90, ΣX² = 69985.91
Therefore;
$$\sqrt{\frac{\sum X^2}{n S_{xx}}} = \sqrt{\frac{69985.91}{10(35051.90)}} = \sqrt{\frac{69985.91}{350519.00}} = \sqrt{0.19966} = 0.4468$$
$$S\sqrt{\frac{\sum X^2}{n S_{xx}}} = 23.16 \times 0.4468 = 10.35$$
To compute the confidence interval;
$$25.33 - (2.306 \times 10.35) \le \alpha_0 \le 25.33 + (2.306 \times 10.35)$$
$$25.33 - 23.87 \le \alpha_0 \le 25.33 + 23.87$$
$$1.46 \le \alpha_0 \le 49.20$$
TEST OF HYPOTHESIS FOR α
$H_0: \alpha_0 = 0$ (the intercept of the regression line is not significantly different from zero)
$H_1: \alpha_0 \neq 0$ (the intercept of the regression line is significantly different from zero)
The test statistic for α is given by:
$$t = \frac{a - \alpha_0}{S\sqrt{\sum X^2 / (n S_{xx})}}$$
with n - 2 degrees of freedom and 0.05 level of significance.
Recall that; a = 25.33, α0 = 0, S = 23.16, ΣX² = 69985.91, Sxx = 35051.90, n = 10
But;
$$S\sqrt{\frac{\sum X^2}{n S_{xx}}} = 23.16 \times 0.4468 = 10.35$$
To compute tcalculated:
$$t_{cal} = \frac{25.33 - 0}{10.35} = \frac{25.33}{10.35}$$
tcal = 2.45
To compute ttabulated:
$$t_{tab} = t_{(1-\alpha/2),\,n-2} = t_{(1-0.05/2),\,10-2} = t_{(1-0.025),\,8} = t_{0.975,\,8}$$
ttab = 2.306
CRITICAL REGION: Reject H0 and accept H1 if tcal is greater than ttab; otherwise accept H0 and reject H1.
CONCLUSION: Since tcal is greater than ttab, i.e. (2.45 > 2.306), H0 is rejected and we conclude that α ≠ 0.
This indicates that the intercept of the regression line relating X and Y is significantly different from zero.
4.2 CORRELATION ANALYSIS USING KARL PEARSON'S PRODUCT MOMENT CORRELATION COEFFICIENT
The Karl Pearson's product moment correlation coefficient is computed from the following table:
YEAR | X (00,000) | Y (00,000) | XY | X² | Y²
2005 | 2.80 | 5.28 | 14.78 | 7.84 | 27.88
2006 | 3.97 | 5.68 | 22.55 | 15.76 | 32.26
2007 | 2.22 | 10.33 | 22.93 | 4.93 | 106.71
2008 | 9.71 | 47.58 | 462.00 | 94.28 | 2,263.86
2009 | 13.35 | 74.14 | 989.77 | 178.22 | 5,496.74
2010 | 63.50 | 119.44 | 7,584.44 | 4,032.25 | 14,265.91
2011 | 85.08 | 139.44 | 11,863.56 | 7,238.61 | 19,443.51
2012 | 105.26 | 144.31 | 15,190.07 | 11,079.67 | 20,825.38
2013 | 132.92 | 152.01 | 20,205.16 | 17,667.73 | 23,107.04
2014 | 172.24 | 211.23 | 36,382.25 | 29,666.62 | 44,618.11
TOTAL | 591.05 | 909.44 | 92,737.51 | 69,985.91 | 130,187.40
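Mathematically, correlation is computed as;
$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}} \quad \text{or} \quad r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$
From the table above, we have the following results.
ΣXi = 591.05
ΣYi = 909.44
ΣXY = 92737.51
ΣX² = 69985.91
ΣY² = 130187.40
To compute the value of r:
$$r = \frac{10(92737.51) - (591.05)(909.44)}{\sqrt{\left[10(69985.91) - (591.05)^2\right]\left[10(130187.40) - (909.44)^2\right]}} = \frac{389850.59}{\sqrt{350519.00 \times 474792.89}} = \frac{389850.59}{407950.89}$$
r = 0.9556
or, with Sxy = 38985.06, Sxx = 35051.90 and Syy = 47479.29,
$$r = \frac{38985.06}{\sqrt{35051.90 \times 47479.29}} = \frac{38985.06}{187.22 \times 217.89} = \frac{38985.06}{40793.36} = 0.9556$$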
Thus, it can be seen that applying Karl Pearson's product moment formula to the data for variables X and Y gives r = 0.9556, which indicates that the two variables are highly and positively correlated. Therefore, it signifies that the variables were fairly distributed.
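The coefficient can be recomputed directly from the coded data. The following Python sketch uses only the standard library; the function name `pearson_r` is an assumption of this illustration.

```python
import math


def pearson_r(x, y):
    """Product moment correlation r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Coded data in hundred thousands (X = advertising expenditure, Y = sales).
x = [2.80, 3.97, 2.22, 9.71, 13.35, 63.50, 85.08, 105.26, 132.92, 172.24]
y = [5.28, 5.68, 10.33, 47.58, 74.14, 119.44, 139.44, 144.31, 152.01, 211.23]

r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

The deviation form used here is algebraically the same as the shortcut formula in the text, and is usually less sensitive to rounding.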
4.2.1 TEST OF HYPOTHESIS FOR r USING t-DISTRIBUTION
In order to know whether r = 0.9556 is significant, the hypotheses are stated thus;
$H_0: \rho = 0$ (there is no relationship)
$H_1: \rho \neq 0$ (there is a relationship)
Test at 0.05 level of significance.
CRITICAL REGION: Reject H0 if tcalculated is greater than ttabulated and accept it otherwise.
From the table, t-tabulated is obtained as $t_{(1-\alpha/2),\,n-2}$ (since it is a two-tailed test) at n - 2 degrees of freedom.
$$t_{(1-\alpha/2),\,n-2} = t_{(1-0.05/2),\,10-2} = t_{0.975,\,8}$$
ttab = 2.306
Also, tcalculated is obtained as shown below;
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
where r = 0.9556 and n = 10
$$t = \frac{0.9556\sqrt{10-2}}{\sqrt{1-(0.9556)^2}} = \frac{0.9556\sqrt{8}}{\sqrt{1-0.9132}} = \frac{2.7028}{0.2946}$$
tcal = 9.1745 ≈ 9.18
DECISION RULE: Reject H0 since tcalculated is greater than ttabulated, i.e. (9.1745 > 2.306).
CONCLUSION: Since tcal is greater than ttab, i.e. (9.1745 > 2.306), we reject H0 and accept H1. This means that there is a relationship between the two variables X and Y; hence r = 0.9556 is significant.
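A short Python sketch of this test is given here, assuming r = 0.9556 and n = 10 from the text; the function name `correlation_t_statistic` is hypothetical and the critical value is the tabulated one. Without intermediate rounding the statistic comes out at about 9.17, marginally below the hand-computed figure.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_cal = correlation_t_statistic(r=0.9556, n=10)
t_tab = 2.306  # t(0.975, 8) from the t-table

reject_h0 = abs(t_cal) > t_tab
print(f"t_cal = {t_cal:.2f}, t_tab = {t_tab}, reject H0: {reject_h0}")
```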
4.2.2 COEFFICIENT OF DETERMINATION
The coefficient of determination is the square of the correlation coefficient:
$$r^2 = (0.9556)^2 = 0.9132 \approx 0.91$$
INTERPRETATION: This shows that about 91% of the variation in the value of Y can be explained by changes in the value of X, leaving about 9% of the variation in Y to be explained by other factors.
4.2.3 THE SPEARMAN'S RANK CORRELATION COEFFICIENT
This is another approach to computing the correlation coefficient and it is given as:
$$r = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$$
Where Σdi² is the sum of the squared differences between the ranks of Xi and Yi, and n is the number of observations.
With Σdi² = 6 and n = 10:
$$r = 1 - \frac{6(6)}{10(99)} = 1 - \frac{36}{990} = 1 - 0.0364 = 0.9636 \approx 0.96$$
YEAR | Xi | Yi | RX | RY | d = (RX - RY) | d²
2005 | 2.80 | 5.28 | 2 | 1 | 1 | 1
2006 | 3.97 | 5.68 | 3 | 2 | 1 | 1
2007 | 2.22 | 10.33 | 1 | 3 | -2 | 4
2008 | 9.71 | 47.58 | 4 | 4 | 0 | 0
2009 | 13.35 | 74.14 | 5 | 5 | 0 | 0
2010 | 63.50 | 119.44 | 6 | 6 | 0 | 0
2011 | 85.08 | 139.44 | 7 | 7 | 0 | 0
2012 | 105.26 | 144.31 | 8 | 8 | 0 | 0
2013 | 132.92 | 152.01 | 9 | 9 | 0 | 0
2014 | 172.24 | 211.23 | 10 | 10 | 0 | 0
TOTAL | | | | | | Σd² = 6
Using the formula:
$$r = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$$
where Σdi² = 6 and n = 10
$$r = 1 - \frac{6(6)}{10(10^2-1)} = 1 - \frac{36}{990} = 1 - 0.0364 = 0.9636 \approx 0.96$$
INTERPRETATION: This shows that sales and advertising expenditure are highly and positively correlated.
From this computational analysis, we can deduce that both formulae confirm the strong degree of relationship between the sales and advertising expenditure of Maltina, as shown by the two values (0.9556) and (0.9636), which are close to each other.
Strictly, the rank method is not the appropriate one in this field of study, because it is meant to be used only when the data for the variables are ranked according to their magnitude.
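The ranking and the rank correlation can be sketched in Python as follows. The helper `ranks` is hypothetical and assumes there are no tied values, which holds for this data set; tied values would need average ranks instead.

```python
def ranks(values):
    """Rank values from smallest (1) to largest (n); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


# Coded data in hundred thousands (X = advertising expenditure, Y = sales).
x = [2.80, 3.97, 2.22, 9.71, 13.35, 63.50, 85.08, 105.26, 132.92, 172.24]
y = [5.28, 5.68, 10.33, 47.58, 74.14, 119.44, 139.44, 144.31, 152.01, 211.23]

rank_x, rank_y = ranks(x), ranks(y)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))

n = len(x)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(f"sum of d^2 = {sum_d2}, Spearman r = {r_s:.4f}")
```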
4.3 ANALYSIS OF VARIANCE FOR TESTING SIGNIFICANCE OF REGRESSION (ANOVA)
This is a statistical method used to compare more than two sample means or a number of pertinent variables in the same experiment. The estimation of σ² shows that Syy = bSxy + SSE; using the partition of the sum of squares we have
TSS = SSR + SSE
Where TSS = total sum of squares (Syy)
SSR = regression sum of squares
SSE = error (residual) sum of squares
We know that SSR and SSE are values of independent variables with 1 and n - 2 degrees of freedom, and TSS is also a variable with n - 1 degrees of freedom.
The hypothesis of interest is H1, which says there is a linear relationship between the sales and advertising expenditure.
$H_0: \beta = 0$
$H_1: \beta \neq 0$
At 0.05 level of significance, Ftab = F(0.05; 1, 8) = 5.32
CRITICAL REGION: We reject H0 at 0.05 level of significance when Fcal > Ftab, i.e. F(0.05; 1, n - 2).
In practice we compute;
TSS = Syy
SSR = bSxy
SSE = TSS - SSR
$$F = \frac{SSR/1}{SSE/(n-2)}$$
Recall that:
Syy = 47479.29
bSxy = (1.11)(38985.06) = 43273.42
SSE = TSS - SSR = 47479.29 - 43273.42
SSE = 4205.87
4.3.1 ANALYSIS OF VARIANCE TABLE (ANOVA TABLE)
SOURCE OF VARIATION | DEGREE OF FREEDOM | SUM OF SQUARES | MEAN SQUARE | F-RATIO
REGRESSION | 1 | 43273.42 | 43273.42 | 82.3111
ERROR | 8 | 4205.87 | 525.73 | -
TOTAL | 9 | 47479.29 | - | -
DECISION RULE: We reject H0 and accept H1, since Fcal, which is 82.3111, is greater than Ftab, i.e. 5.32. It means there is a linear relationship between the sales and advertising expenditure. Therefore, the regression equation is significant at the 0.05 level of significance.
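The ANOVA quantities can be verified with the following Python sketch. It uses the summary values as rounded in the text and the tabulated critical value 5.32; the variable names are assumptions of the illustration.

```python
# Summary values as rounded in the text.
syy = 47479.29   # total sum of squares (TSS)
sxy = 38985.06
b = 1.11
n = 10
f_tab = 5.32     # F(0.05; 1, 8) from the F-table

ssr = b * sxy            # regression sum of squares
sse = syy - ssr          # error sum of squares
msr = ssr / 1            # mean square for regression (1 d.f.)
mse = sse / (n - 2)      # mean square for error (n - 2 d.f.)
f_cal = msr / mse

reject_h0 = f_cal > f_tab
print(f"SSR = {ssr:.2f}, SSE = {sse:.2f}, MSE = {mse:.2f}")
print(f"F_cal = {f_cal:.2f}, F_tab = {f_tab}, reject H0: {reject_h0}")
```

For simple linear regression the F-ratio equals the square of the t statistic for the slope, so the two tests lead to the same decision; small numerical gaps between them here come from rounding b.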
CHAPTER FIVE
5.0 CONCLUSION
5.1 SUMMARY OF ANALYSIS
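ESTIMATE | FORMULA USED | RESULT
1. Regression of sales on advertising | $b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$ | b = 1.11
 | $a = \frac{\sum Y_i}{n} - b\,\frac{\sum X_i}{n}$ | a = 25.33
 | $Y = a + bX$ | Y = 25.33 + 1.11Xi
2. Karl Pearson's product moment correlation coefficient | $r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$ or $r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$ | r = 0.9556
3. Spearman's rank correlation coefficient | $r = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$ | r = 0.9636
4. Coefficient of determination | $r^2$ | r² = 0.9132
5a. Test of β (beta) | $H_0: \beta = 0$, $H_1: \beta \neq 0$; $t = \frac{b - \beta_0}{S/\sqrt{S_{xx}}}$; reject H0 if tcal > ttab and accept otherwise | tcal = 8.97, ttab = 2.306; reject H0 and conclude that β ≠ 0
5b. Test of α (alpha) | $H_0: \alpha = 0$, $H_1: \alpha \neq 0$; $t = \frac{a - \alpha_0}{S\sqrt{\sum X^2/(nS_{xx})}}$; reject H0 if tcal > ttab and accept otherwise | tcal = 2.45, ttab = 2.306; reject H0 and conclude that α ≠ 0
5c. Test of r | $H_0: \rho = 0$, $H_1: \rho \neq 0$; $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$; reject H0 if tcal > ttab and accept otherwise | tcal = 9.1745, ttab = 2.306; reject H0, r = 0.9556 is significant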
5.2 FINDINGS
This project has been able to show the relationship that exists between the sales and advertising expenditure of International Breweries Plc, Ilesa in Osun State within the 10-year period 2005-2014.
From the analysis in the previous chapters, and in relation to the aims and objectives of this project, we can conclude that there is a positive linear relationship between sales and advertising expenditure over the period under review; that is, there is dependency between the two variables X and Y.
When testing the hypothesis on the regression coefficient using the various approaches, we rejected the hypothesis that the regression equation has no significant effect. The alternative hypothesis was accepted, which means that the regression of sales on advertising expenditure is significant. The conclusion, thus, is that there exists a positive linear relationship between sales and advertising expenditure.
The regression equation Y = 1.11X + 25.33 was estimated using the least squares method, which gives the line of best fit for the linear relationship.
Future values can therefore be predicted from it, provided the economy remains stable.
We also calculated the correlation coefficient (r) using Karl Pearson's product moment formula. This gives 0.9556, which indicates a positive and strong relationship between the variables X and Y. When the rank correlation method was used, the result was 0.9636, and when the coefficient of determination was used to separate explained and unexplained variation, it gives 0.9132; that is, about 91.3% of the total variation could be explained, leaving a smaller part of about 8.7% unexplained.
Lastly, the execution of this project exposed the writers to the challenges of a career in statistics. It also reflects the stress involved in data collection and analysis. The need for accuracy and unbiased interpretation of figures cannot be overemphasized. Therefore, "statistics" is not only a theoretical course where formulae are the order of the day, but a body of methods and theory applied to numerical evidence in making decisions in the face of uncertainty.
5.3 RECOMMENDATIONS
The main objective of International Breweries Plc, Ilesa is to generate revenue. The data used in this project show that the variables are linearly and positively correlated, i.e. sales increase as the advertising expenditure increases. The sales for the next five years can even be predicted from the 2005-2014 data, and this shows that an increase in advertisement leads to an increase in sales.
Based on the analysis we have made so far on the original data (raw data) used in this project, and having examined and analysed the values obtained from the computations, the following recommendations are made:
i. The day-to-day activities of the organisation must be monitored very well, and management should care about the workers' welfare, that is, improve their standard of living as a way of motivating the workers and encouraging them to put in their best in their respective work.
ii. The organisation should adopt budgetary policies that reflect the true relationship of sales as a function of advertising expenditure, so that spending on advertising yields a corresponding increase in sales. Viable and profit-oriented economic projects should be embarked upon to generate more revenue for the organisation.
iii. Since an increase in advertising expenditure leads to a corresponding increase in sales, promotional projects should be embarked upon to generate more revenue for the organisation, for example organising Maltina Night in different areas, raffle draws, bonanzas, etc.
iv. Highly competent and well trained personnel should be employed in order to have higher productivity, which in turn yields more profit for the company at large. Moreover, labour should be employed in the areas where it is needed in order to reduce the high rate of wastage.
v. The company's statistical unit should be better equipped because of its indispensability: it is saddled with the responsibility of collecting, collating, organizing, summarizing, analysing, presenting and interpreting data pertaining to the activities of the company. This will ensure the availability of accurate data in all other areas of the company, and will allow the statistical unit to monitor the positive correlation between sales and advertising expenditure, so as to increase and maintain maximum turnover, which can lead to a large profit over a specified period of time.