Tuesday, April 14, 2020

Playing with Reproduction Value

Background

When the chart of infection spread for CoronaVirus was published it intrigued me that both the tall thin curve representing unrestricted spread and the short fat curve representing controlled spread were normal distribution bell curves.



I wanted to know why the curves rose and fell and in particular what made the curves fall in such a symmetrical way. Particularly what caused them to fall. I decided the best way was to make my own model so I can play with it.



When I started reading I realised Basic Reproduction Value, R0 (are-nought) was the most important factor. It's pretty well the only property of the spread of the virus that at this time we are able to control.

The tall thin curve is the number of hospitalisations with time with no intervention. The native property of the virus.

The short fat curve is the number of hospitalisations if the R0 value is attenuated by reducing the opportunity to spread. But why the bell? Why not an ever increasing exponential curve until everyone who will be hospitalised is?

Assumptions

The charts use hospitalisations. That's so the dotted line (available hospital resources) is relatable to the numbers. I have assumed that infections follow that curve. There is a skewing factor that Is not taken into account in those curves and in my calculations and that is that there should be more hospitalisations (though not infections) in as the more vulnerable will be hospitalised early.

I have used a simple model for the reproduction value iteration time. In reality the virus is infectious for a long period. I have seen number like 7-10 days in the incubation period and up to several weeks in the symptomatic period. I haven't modelled for this. For convenience I've called an iteration period a day.

The model uses a chance or opportunity factor. I have assumed that the opportunity factor is linear with the number of people who have been infected. There has been some talk of a small number of reinfections in South Korea. If infection does not confer immunity in the vast number of cases then all bets are off and we are all dead.

Population density will have a huge effect on R value (more specifically on oportunity factor.) I haven't even attempted to model how high density areas within the same population affect the overall shape of the curve. My speculation is that there will be a much longer tail as more isolated communities become infected over time.

The model has no 'fudge factors' to aim to give a preconceived result. The R based calculations are taken directly from the definition of R

Method

The model is very simple. Day zero there are n infectious cases. Which means on day two there will be n*R. The total infected is n+(n*R)* The next day the infections are the (total infected on the previous day) * RO. They are added to the running total. 

I then apply an opportunity factor which is the chance of encountering RO number of people who are do not have acquired immunity. Which is the population/the running total of infected people.

The model runs until the number of new infections is zero or given number of iteraterations or a target number of infections has been reached
.

I have pasted the code after the footnotes. It runs under Python 3. I use version 3.6 on various versions of Ubuntu. This has been tested on 18.04.

Results

I'm presenting the results in a standard form:

  • Parameters
  • Raw output
  • csv
  • chart
  • analysis
The charts are created using LibreOffice Calc

Run one

I wanted a picture of how the tall thin curve might look so I used the profile for infecting the whole population at R0=3.

The parameters for my first run were:

population = 70m
day zero infections = 1
iterations 100
R value = 3

I chose the R0 value because Niall Ferguson had revised his estimate of 200-500k deaths to 20k based on a R) value of "probably greater than 3."

Here are the results. 
If you want the comma separated values to load into a spreadsheet you can cut and paste these:

=================================================================

day,infected,todaysinfections, opportunityfactor
1,4.0,3.0,1.0
2,12.999999614285715,8.999999614285715,0.9999999571428572
3,39.99999382857178,26.999994214286065,0.9999998285714341
4,120.99993134287536,80.99993751430358,0.999999442857231
5,363.99932731491725,242.9993959720419,0.9999982857152666
6,1092.9937348617311,728.9944075468138,0.999994814295324
7,3279.942840759639,2186.9491058979074,0.999984400089502
8,9840.48283497705,6560.539994217413,0.9999531579594177
9,29519.33628960095,19678.8534546239,0.9998594359595003
10,88531.0014957269,59011.66520612595,0.9995783094815771
11,265342.09842229204,176811.09692656514,0.9987352856929181
12,793764.7356017698,528422.6371794777,0.9962094128796816
13,2361056.530291741,1567291.794689971,0.9886605180628318
14,6904340.644671642,4543284.114379901,0.9662706352815467
15,19189833.989471916,12285493.344800275,0.9013665765046909
16,45942461.21758762,26752627.228115708,0.7258595287218298
17,73525420.95292085,27582959.73533324,0.3436791397487483

==============================================================

The chart of iterations v infections is:


Definitely starting to get a bell curve. The exponential part works. And the opportunity factor starts to kick in but doesn't have chance to pull the curve down before the target infections is reached.

The code seems to work as expected although I was surprise the opportunity factor didn't get a chance to kick in.

Run 2

For this run I wanted to try to get the full bell curve. And I did with these parameters.

population = 70m
day zero infections = 1
iterations 100
R value = 2

Raw output:


I was quite pleased to see the bell in the results.

If you want to play with them in a spreadsheet, here are the comma separated values:

=====================================================================
day,infected,todaysinfections, opportunityfactor
1,3.0,2.0,1.0
2,6.999999885714286,3.9999998857142858,0.9999999714285714
3,14.999998971428605,7.999999085714319,0.999999914285716
4,30.99999394285784,15.999994971429238,0.9999998000000146
5,62.99997017143768,31.999976228579836,0.9999995714286578
6,126.99986594295245,63.999895771514765,0.9999991142861404
7,254.99942708660234,127.99956114364987,0.999998200001915
8,510.99762046489644,255.9981933782941,0.999996371436756
9,1022.9902769794999,511.9926565146035,0.9999927143197077
10,2046.9606399653685,1023.9703629858686,0.9999854001389004
11,4094.8415087068443,2047.8808687414755,0.9999707719908576
12,8190.363711912501,4095.5222032056563,0.9999415165498757
13,16380.449840583464,8190.086128670962,0.9998830090898299
14,32756.789266355827,16376.339425772363,0.9997660078594202
15,65494.14183438098,32737.35256802515,0.9995320601533376
16,130907.58776828786,65413.44593390688,0.9990643836880803
17,261489.82103606238,130582.23326777453,0.9981299058890245
18,521678.6934508844,260188.87241482205,0.9962644454137707
19,1038178.3031141586,516499.6096632741,0.9925474615221302
20,2055857.003245857,1017678.7001316986,0.9851688956697977
21,4031437.2359366487,1975580.2326907916,0.9706306285250591
22,7755042.680259095,3723605.444322446,0.9424080537723336
23,14377204.556859408,6622161.876600313,0.8892136902820129
24,24901294.90187841,10524090.345019003,0.7946113777591514
25,38461947.971023865,13560653.069145458,0.6442672299731655
26,50681279.27394551,12219331.302921642,0.4505436146996591
27,57425903.87736294,6744624.60341743,0.2759817389436356
28,59848977.1579086,2423073.2805456645,0.1796299588948152
29,60551739.29053585,702762.1326272473,0.14501462631559137
30,60741450.163351946,189710.87281609673,0.13497516727805925
31,60791634.38507288,50184.22172093443,0.13226501195211507
32,60804837.662555486,13203.277482604371,0.13154809449895885
33,60808306.413785264,3468.751229775726,0.13135947624920735
34,60809217.37669668,910.9629114187214,0.13130992266021052
35,60809456.589925475,239.21322879153985,0.1312969089043331
36,60809519.40420555,62.81428007674185,0.1312934915724932
37,60809535.898305126,16.494099571382108,0.13129259422563497
38,60809540.2294036,4.331098471276219,0.13129235859564106
39,60809541.366683334,1.137279731252975,0.1312922967228057
40,60809541.665315434,0.29863209891056264,0.13129228047595237

=================================================================

As you can see the run ended because there were no new infections. There were still ~10 million uninfected.

I had acheived my bell curve:


Run 3

For run 3 I wanted to find an R0 value that gave me bell curve and infected every one.

I could have added some goal seeking to my code but decided to do it by hand. I knew the value was greater than 2 and less than 3 so I did increments of 0.1. The sensitivity is greater than I expected so I had to go to a second decimal place.

I won't post all the results, just the the one that got me closest:

population = 70m
day zero infections = 1
iterations 100
R value = 2.56


csv:

========================================================
day,infected,todaysinfections, opportunityfactor
1,3.56,2.56,1.0
2,10.113599760325487,6.553599760325486,0.9999999634285714
3,26.89081296246121,16.777213202135727,0.9999998698057176
4,69.84046287418919,42.949649911727974,0.9999996301312434
5,179.79145851842884,109.95099564423965,0.999999016564816
6,461.26528843560914,281.4738299171803,0.9999974458363069
7,1181.8335550987078,720.5682666630987,0.9999934247815937
8,3026.457200181365,1844.6236450826568,0.9999831309492129
9,7748.489632671324,4722.032432489959,0.9999567791828545
10,19835.55473445807,12087.065101786746,0.9998893215766762
11,50769.67370390198,30934.11896944391,0.9997166492180792
12,129903.58341521795,79133.90971131597,0.9992747332328012
13,332110.4489872361,202206.86557201814,0.998144248808354
14,847304.0774886198,515193.6285013837,0.9952555793001824
15,2150235.418221515,1302931.3407328953,0.9878956703215912
16,5383280.8504639575,3233045.4322424424,0.9692823654539783
17,13023373.812327115,7640092.961863157,0.9230960021362292
18,28943162.735471915,15919788.9231448,0.8139518169667554
19,52846840.87116953,23903678.135697614,0.5865262609218297
20,67841990.3561394,14995149.484969873,0.24504514469757815
21,69025430.52783662,1183440.1716972135,0.030828723483722762
22,69067610.03309497,42179.50525835205,0.013922435316619703
23,69069048.30654654,1438.2734515715865,0.01331987095578611
24,69069097.27439271,48.96784616825324,0.0132993241921923
25,69069098.94147752,1.667084815835123,0.013298624651532727
26,69069098.99823245,0.05675493254555362,0.013298600836035396

=======================================================



This is quite a long post so I'll end the results there. 

Conclusions

Whether you call it 'herd immunity' or 'widespread acquired infection' the opportunity factor serves to form the expected bell curve in all cases where R0 is between 1.2 (results not shown here for brevity. No new cases iteration 158, 22602160 infected) and 2.56.

R0 of 2 or less leaves a large population of potential infections if the R0 is increased by removing restrictions unless there are no or very few infections when restrictions are relieved.**

Even with this simple model the theoretical expected and possibly the real world results for cases/deaths (we wont know until at least one country has fully played out their infection.)

Any discussion of comparison with actual infections is futile because there is no way of telling what they are except other models.

A more valid result might be to take the deaths and compare them to infections calculated by this model to get an estimate of mortality. This might become another blog post at some time maybe.

Comments

Most comments are welcome either here or by mentioning me on  twitter. I will answer as many and as soon as I can.

I will ignore stuff like "my old aunty died of CoronaVirus so you are wrong." Your anecdotes do not interest me.

Or anything which refers to something I didn't actually say.

If you are commenting please be specific. "Cobblers" or any other such stuff will be either ignored or published to show what an idiot I am dealing with.

If you have a set of parameters you wish me to run please send them. Only the parameters show in the results section please.

If you want me to tailor the model or submit change requests then you'd better be ready with some paypal. I might or might not improve, change, remove it at any time. My choice.

You may take what additional inferences from the results. They are not my responsibility.

You use the information here at your own risk. Any decisions you base on it are your problem not mibe.


Footnotes

*The brackets are extraneous but I know a lot of people have problems with operated precedence and I didn't want to use RPN although I'm probably one of the last people alive who can.

** I may revise my model so the R0 value can be changed in the course of the curve. That might  show any subsequent peak(s).


Code:

Also available on GitHub here

If anyone requires the results files or images I can zip them up and make them available.


=========================================================
#---- License ----
"""
This code is the property of David Sinfield (dsinfield6@gmail.com) It is  released under The GNU public license the short version of which is you may copy, 
distribute, amend but this section of header should remain intact.

The long version is at https://www.gnu.org/licenses/gpl-3.0.html
"""

#------------------------------------------------------------------------------------------------------------------------------
# This is a Transmissibility generator V1.0
# The parameters are prompted for on the command line. It takes no command line parameters.
#blog post description is at https://davidsdiamondsandrust.blogspot.com/2020/04/playing-with-reproduceability-value.html
#------------------------------------------------------------------------------------------------------------------------------

#usage
print ("This program models the number of people infected in over a number of days for a given R value.")
print ()
print ("Day Zero are the number of infections on Day 0")
print("The R value is the number of people each infected person can transmit to.")
print(" R may be a decimal value. The result on day N will be rounded up to an integer.")
print("The program assumes that R declines linearly with the number of infections as the chance of contacting R uninfected declines")
print ("the chance parameter calculation may be a little simplistic but is calculated as (population-infected)/population ")
print ("The definition of R does not include a time element so this assumes one iteration a day.")
#functions
#all the get functions require validity checking for input type numeric.
def getpopulation():
    ipopulation=input("Population(default 70m) - ")
    if len(ipopulation) > 0:
        population=int(ipopulation)
    else:
        print("70m used")
        population=70000000
        return population
    
def getdayzero():
    idayzero=input("Day Zero infections (default =1) - ")
    if len(idayzero)>0:
        dayzero=int(idayzero)
    else:
        print ("1 used")
        dayzero=int(1)
    return dayzero
    #ToDo validty check/default
def getdays():
    return int(input("Days - "))
      #ToDo validty check/default
def getrvalue():
    return float(input("R Value - "))
def gettarget():
    itarget=input("Target value (leave blank for population=target) - ")
    if len(itarget)>0:
        target=int(itarget)
    else:
        print(str(population)+" used")
        target=population
    return target
#end functions

#start procedural
population=getpopulation()
dayzero=getdayzero()
days=getdays()
rvalue=getrvalue()
target=gettarget()

infected=dayzero
newcases=dayzero
csv=["day,infected,todaysinfections, opportunityfactor\n"]
for day in range (1, days):
    # calculations
    chance=((population+1)-infected)/population
    todayscases=float((newcases*rvalue)*chance)
    infected=infected+todayscases
    newcases=float(todayscases)
    #show results
    print ("Chance: "+str(chance)+" Day number: " + str(day) + " - Infected: " + str(int(infected))+" Newcases today:"+str(int(todayscases+0.9)))
    csv.append(str(day)+","+str(infected)+","+str(todayscases)+","+str(chance)+"\n")
    #exit conditions
    if target > 0:
        if infected > target: 
            break
    if newcases<1:
        print("No new cases")
        break
print("Program end")

#Option to write a file of data
dofile=input("Enter y to write a result file: ")
if dofile=="y" or dofile== "Y":
    filename=input("Filename (exension .csv will be added): ")
    if len(filename)==0:
        print ("No Filename results.csv will be used")
        f=open("results.csv","a")
    else:
        f=open(filename+".csv", "a")
    print ("Writing "+ f.name)
    for x in csv:
        f.write(x)
    f.close
# end procedural
 

===============================================