The attached document ‘Modelling Association Football Scores and Inefficiencies in the Football Betting Market’ (which is a subsection of a 1997 publication from the Royal Statistical Society) describes the derivation of a parametric model that utilises historical results data as the basis for predicting the outcome of future football matches between two given teams.
A simple Poisson regression model is proposed and then modified to allow for the dependence of certain score combinations and the dynamic nature of teams’ performances. The likelihood function is then maximised numerically and extended to incorporate a temporal parameter, producing a time-dependent rating mechanism.
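For orientation, the per-match log-likelihood contribution of such a model can be sketched as follows. This is a minimal sketch that assumes the form in which the model is usually quoted: home goals are Poisson with mean alpha_home * beta_away * gamma, away goals are Poisson with mean alpha_away * beta_home, a correction factor adjusts the four lowest score combinations, and older matches are down-weighted exponentially. The function and argument names are illustrative and not taken from the document; all formulae should be checked against the document before use.

```python
import math

def tau(x, y, lam, mu, rho):
    """Assumed low-score dependence correction; equals 1 outside 0-0, 0-1, 1-0, 1-1."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0

def poisson_logpmf(k, rate):
    """Log of the Poisson probability of k events with the given mean."""
    return -rate + k * math.log(rate) - math.lgamma(k + 1)

def time_weight(t_now, t_match, xi):
    """Assumed exponential down-weighting of a match played at t_match."""
    return math.exp(-xi * (t_now - t_match))

def match_loglik(x, y, alpha_home, beta_home, alpha_away, beta_away,
                 gamma, rho, weight=1.0):
    """Weighted log-likelihood contribution of one match with score x-y."""
    lam = alpha_home * beta_away * gamma   # expected home goals
    mu = alpha_away * beta_home            # expected away goals
    return weight * (math.log(tau(x, y, lam, mu, rho))
                     + poisson_logpmf(x, lam)
                     + poisson_logpmf(y, mu))
```

Summing match_loglik over all matches, with weight set by time_weight, gives the objective that a numerical optimiser would maximise subject to the calibration constraint on the attack parameters.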
The document describes the derivation of this model and provides output metrics based on a set of English league football results between 1992 and 1995. The utility of the model is then verified against data from the 1995/96 season.
We want to reproduce the derivation and test the validity of this model using results from the top four English divisions for the 2003/04 to 2009/10 seasons (the ‘sample data set’). However, the document is a summary, so much of the detail of the derivation is missing. We have broken the project into the following required outputs:
1. Page 268 – Table 1 – the method used to calculate the standard errors is not specified. However, the second table on the same page specifies the error method as ‘bootstrapping’. Please reproduce this table with the sample data set, along with bootstrap errors and details of the method applied (i.e. number of trials, with/without replacement, etc.). If you use a statistical software package to calculate the errors, please advise which package and provide an example file.
2. Page 268 – Table 2 – reproduce the table for the sample data set along with bootstrap standard errors and highlight all score combinations that do not satisfy the assumption of independence.
3. Page 271 – Equation 4.3 – the paragraph that immediately follows equations 4.3 and 4.4 discusses the ‘direct numerical maximisation’ of the proposed likelihood function (4.3). Adhering to the calibration constraint expressed earlier in section 4.2, reproduce Table 3 by team name (as opposed to aggregated by division). Please also provide an explanation along with supporting derivation of all formulae required to support solving equation 4.3.
4. Page 273 – Figure 1 – reproduce the graph shown in Figure 1 and identify the value of ξ at which the function is maximised. Please provide supporting working.
5. Page 274 – Tables 4 & 5 provide snapshot values and standard errors as of 5th August 1995 for the Premier League and 2nd Division respectively. Figure 2 displays time-series data for a subset of teams for time periods 60 to 174. Please produce a complete time-series database for all teams by half-week period for all values of t where t >= 60 (setting t = 1 to the earliest time period for which there is results data). Please also provide details of how the database is compiled.
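As an illustration of the kind of bootstrap detail requested in items 1 and 2, the sketch below resamples matches with replacement and reports the standard deviation of a statistic across trials. It assumes that the independence check compares the observed frequency of each score combination with the product of the marginal home and away frequencies (a ratio near 1 is consistent with independence). The function names, the number of trials and the cap on goals are illustrative assumptions, not values from the document.

```python
import random
import statistics
from collections import Counter

def independence_ratios(scores, max_goals=4):
    """Ratio of observed joint frequency to product of marginals per score cell."""
    n = len(scores)
    joint = Counter(scores)
    home = Counter(h for h, _ in scores)
    away = Counter(a for _, a in scores)
    ratios = {}
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            expected = home[h] * away[a] / n  # expected count under independence
            ratios[(h, a)] = joint[(h, a)] / expected if expected > 0 else float("nan")
    return ratios

def bootstrap_se(scores, n_trials=1000, max_goals=4, seed=0):
    """Bootstrap standard error of each ratio, resampling matches with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    samples = {}
    for _ in range(n_trials):
        resample = rng.choices(scores, k=n)  # same size as the data, with replacement
        for cell, r in independence_ratios(resample, max_goals).items():
            if r == r:  # skip NaN cells (a marginal count was zero in this resample)
                samples.setdefault(cell, []).append(r)
    return {cell: statistics.stdev(v) for cell, v in samples.items() if len(v) > 1}
```

Here scores is a list of (home goals, away goals) tuples, one per match. A cell whose ratio differs from 1 by more than about two standard errors would be a candidate for highlighting under item 2.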
Ideally we want to be able to reproduce numerous similar, iterated databases. Being able to generate the database from within a single software package is therefore extremely desirable.
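To make the time index in item 5 concrete, the following sketch maps match dates to half-week periods. The split used here (Monday to Thursday as a midweek period, Friday to Sunday as a weekend period) is an assumption made for illustration and may differ from the convention in the document; the dates in the usage note are also illustrative.

```python
from datetime import date

def _raw_half_week(d):
    """Absolute half-week counter: two periods per Monday-based calendar week."""
    week = (d.toordinal() - 1) // 7         # ordinal 1 is a Monday
    weekend = 1 if d.weekday() >= 4 else 0  # Fri, Sat, Sun count as the weekend period
    return 2 * week + weekend

def half_week_index(match_date, first_date):
    """Period t for a match, with t = 1 for the period containing first_date."""
    return _raw_half_week(match_date) - _raw_half_week(first_date) + 1

def rating_periods(last_date, first_date, t_min=60):
    """All periods t >= t_min up to the period containing last_date."""
    return list(range(t_min, half_week_index(last_date, first_date) + 1))
```

For example, with a first match on Saturday date(2003, 8, 9), a match on Tuesday date(2003, 8, 12) falls in period 2 and one on Saturday date(2003, 8, 16) in period 3. Refitting the model once per period returned by rating_periods would give one row per team per period in the database.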
8 freelancers bid an average of $270 for this job
Hi, we are a highly skilled team in stats and programming, with experience of similar projects. We look forward to the opportunity of serving you. Zack
I am a US-based consultant with immediate availability to work on your project. I look forward to working with you on this project. See PM for details. Regards, Paranos.