In progress

Statistical analysis of data set and formula solving

The attached document ‘Modelling Association Football Scores and Inefficiencies in the Football Betting Market’ (which is a subsection of a 1997 publication from the Royal Statistical Society) describes the derivation of a parametric model that utilises historical results data as the basis for predicting the outcome of future football matches between two given teams.

A simple Poisson regression model is proposed and then modified to allow for dependence between certain score combinations and for the dynamic nature of teams’ performances. The likelihood function is maximised numerically and then extended with a temporal parameter to produce a time-dependent rating mechanism.
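As a rough illustration of the kind of model described above, the sketch below computes the probability of one scoreline under Poisson goal counts with a low-score dependence correction. The correction `tau` and the dependence parameter `rho` follow the form generally attributed to the 1997 paper; the function names and the example rates are illustrative assumptions, not taken from the document.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals for a Poisson count with the given rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def tau(x, y, lam, mu, rho):
    """Low-score correction: differs from 1 only for 0-0, 0-1, 1-0 and 1-1."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def score_probability(x, y, lam, mu, rho):
    """P(home scores x, away scores y) for home rate lam and away rate mu."""
    return tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)


if __name__ == "__main__":
    # Hypothetical rates, chosen only to show the call.
    print(score_probability(1, 1, lam=1.4, mu=1.1, rho=-0.1))
```

With `rho = 0` the correction disappears and the scores are independent Poisson counts. The four adjusted cells cancel out, so the probabilities still sum to one.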

The document describes the derivation of this model and provides output metrics based on a set of English league football results between 1992 and 1995. The utility of the model is then verified against data from the 1995/96 season.


We want to reproduce the derivation and test the validity of this model using results from the top four English divisions for the 2003/04 to 2009/10 seasons (the ‘sample data set’). However, the document is a summary, so much of the detail of the derivation is missing. We have broken the project into the following required outputs:

1. Page 268 – Table 1 – the method used to calculate the standard errors is not specified. However, the second table on the same page gives the error method as ‘bootstrapping’. Please reproduce this table with the sample data set, along with bootstrap standard errors and details of the method applied (i.e. number of trials, with/without replacement, etc.). If you use a statistical software package to calculate the errors, please advise which package and provide an example file.

2. Page 268 – Table 2 – reproduce the table for the sample data set along with bootstrap standard errors and highlight all score combinations that do not satisfy the assumption of independence.

3. Page 271 – Equation 4.3 – the paragraph that immediately follows equations 4.3 and 4.4 discusses the ‘direct numerical maximisation’ of the proposed likelihood function (4.3). Adhering to the calibration constraint expressed earlier in section 4.2, reproduce Table 3 by team name (as opposed to aggregated by division). Please also provide an explanation along with supporting derivation of all formulae required to support solving equation 4.3.

4. Page 273 – Figure 1 – reproduce the graph shown in Figure 1 and identify the value of ξ at which the function is maximised. Please provide supporting working.

5. Page 274 – Tables 4 & 5 provide snapshot values and standard errors as of 5th August 1995 for the Premier League and 2nd Division respectively. Figure 2 displays time-series data for a subset of teams for time periods 60 to 174. Please produce a complete time-series database for all teams by half-week period for all values of t where t >= 60 (setting t = 1 to the earliest time period for which there is results data). Please also provide details of how the database is compiled.
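To make the bootstrap request in items 1 and 2 more concrete, here is a minimal sketch of one possible approach, not a statement of the method used in the paper. It assumes the independence check compares each empirical joint score probability with the product of its marginal probabilities, and that matches are resampled with replacement. The function names, the default of 1000 trials and the cap of 4 goals are all illustrative assumptions.

```python
import numpy as np


def dependence_ratios(home, away, max_goals=4):
    """Empirical joint probability divided by the product of the marginals."""
    home = np.asarray(home)
    away = np.asarray(away)
    ratios = np.full((max_goals + 1, max_goals + 1), np.nan)
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            joint = np.mean((home == i) & (away == j))
            product = np.mean(home == i) * np.mean(away == j)
            if product > 0:  # leave NaN where a marginal is empty
                ratios[i, j] = joint / product
    return ratios


def bootstrap_se(home, away, n_trials=1000, max_goals=4, seed=0):
    """Standard error of each ratio from resampling matches with replacement."""
    home = np.asarray(home)
    away = np.asarray(away)
    rng = np.random.default_rng(seed)
    n = len(home)
    samples = np.empty((n_trials, max_goals + 1, max_goals + 1))
    for b in range(n_trials):
        idx = rng.integers(0, n, size=n)  # same size as the original sample
        samples[b] = dependence_ratios(home[idx], away[idx], max_goals)
    return np.nanstd(samples, axis=0, ddof=1)
```

Under independence each ratio should be close to 1. One possible way to flag a score combination is when its ratio differs from 1 by more than about two standard errors, although that threshold is our assumption rather than something the document specifies.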

Ideally we want to be able to reproduce numerous similar, iterated databases. Being able to generate the database from within a single software package is therefore extremely desirable.
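As a hedged sketch of how the half-week time index and the time weighting could be set up when compiling such a database, the helpers below map match dates to periods and compute an exponential down-weighting of older results. The 3.5-day period length, the function names and the exponential form `exp(-xi * elapsed)` are assumptions made for illustration. The time-decay parameter is the one referred to in item 4, written ξ in the 1997 paper.

```python
import math
from datetime import date

HALF_WEEK_DAYS = 3.5  # assumption: a half-week is exactly 3.5 days


def half_week_period(match_date, first_date):
    """1-based half-week index, with t = 1 for the earliest results date."""
    elapsed_days = (match_date - first_date).days
    if elapsed_days < 0:
        raise ValueError("match_date is earlier than first_date")
    return int(elapsed_days // HALF_WEEK_DAYS) + 1


def decay_weight(t_now, t_match, xi):
    """Weight for a match played at period t_match when rating at period t_now."""
    if t_match > t_now:
        raise ValueError("match is in the future relative to t_now")
    return math.exp(-xi * (t_now - t_match))


if __name__ == "__main__":
    first = date(2003, 8, 9)  # hypothetical first results date
    t = half_week_period(date(2003, 8, 16), first)
    print(t, decay_weight(t_now=60, t_match=t, xi=0.005))  # xi is illustrative
```

A complete database would repeat the weighted fit for every period t >= 60, using only matches played before that period and storing each team's parameters against t.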

Skills: Excel, Mathematics, Statistics


About the employer:
(0 reviews) Los Angeles, United States

Project ID: #2347311

Awarded to:


Hi, please take a look at my profile to check my ability in Math, Statistics and Programming. Let me help you complete the project.

$200 USD in 5 days
(10 reviews)

8 freelancers bid an average of $270 for this job


Please check your inbox. Thanks.

$360 USD in 5 days
(56 reviews)

See PM, please.

$150 USD in 7 days
(30 reviews)

Hi, We are a highly skilled team in stats and programming, with experience of similar projects. Look forward to the opportunity of serving you. Zack

$600 USD in 15 days
(2 reviews)

I am a US-based consultant with immediate availability to work on your project. I look forward to working with you on this project. See PM for details. Regards, Paranos.

$250 USD in 3 days
(1 review)

I will try to help with your problem; I have experience in Excel and statistical solutions.

$250 USD in 10 days
(0 reviews)

Hi, I am an expert in statistics and also have profound knowledge of statistical software. Although I am a new freelancer, you can definitely rely on me. Please give me a chance to prove myself. I have experience in mystatlab, mymathlab w…

$100 USD in 10 days
(0 reviews)

Hi, contact me. I can help you. I am a statistician with more than six years' experience in data analysis using SPSS and Minitab. Thank you.

$200 USD in 4 days
(0 reviews)

Dear Sir, I am interested in the project.

$150 USD in 4 days
(0 reviews)