Mo, 07/21/2014 - 16:48 — Draketo
## Table of Contents

## 1 The challenge

## 2 The old script

## 3 Solution with Guile

## 4 The Answer

## 5 Conclusion

I needed to calculate the probability that for every freenet user there are at least 70 others in a distance of at most 0.01. That needs binomial coefficients with n and k on the order of 4000. My old Python script failed me with an `OverflowError: integer division result too large for a float`

. So I turned to Guile Scheme and exact math.

I need the probability that within 4000 random numbers between 0 and 1, at least 70 are below 0.02.

Then I need the probability that within 4000 random numbers, at most 5 find less than 70 others to which the distance is at most 0.02.

Or more exactly: I need to find the right maximum length to replace the 0.02.

I had a Python-script lying around which I once wrote for estimating the probability that a roleplaying group will have enough people to play in a given gaming night.

It’s called spielfaehig.py (german for “able to play”).

It just does this:

from math import factorial fac = factorial def nük(n, k): if k > n: return 0 return fac(n) / (fac(k)*fac(n-k)) def binom(p, n, k): return nük(n, k) * p** k * (1-p)**(n-k) def spielfähig(p, n, min_spieler): try: return sum([binom(p, n, k) for k in range(min_spieler, n+1)]) except ValueError: return 1.0

Now when I run this with p=0.02, n=4000 and min_{spieler}=70, it returns

OverflowError: integer division result too large for a float

The reason is simple: There are some intermediate numbers which are much larger than what a float can represent.

To fix this, I rewrote the script in Guile Scheme:

#!/usr/bin/env guile-2.0 !# (define-module (spielfaehig) #:export (spielfähig)) (use-modules (srfi srfi-1)) ; for iota with count and start (define (factorial n) (if (zero? n) 1 (* n (factorial (1- n))))) (define (nük n k) (if (> k n) 0 (/ (factorial n) (factorial k) (factorial (- n k))))) (define (binom p n k) (* (nük n k) (expt p k) (expt (- 1 p) (- n k)))) (define (spielfähig p n min_spieler) (apply + (map (lambda (k) (binom p n k)) (iota (1+ (- n min_spieler)) min_spieler))))

To use this with exact math, I just need to call it with p as exact number:

(use-modules (spielfaehig)) (spielfähig #e.03 4000 70) ; ^ note the #e - this means to use an exact representation ; of the number ; To make Guile show a float instead of some huge division, just ; convert the number to an inexact representation before showing it. (format #t "~A\n" (exact->inexact (spielfähig #e.03 4000 70)))

And that’s it. Automagic hassle-free exact math is at my fingertips.

It just works and uses less then 200 MiB of memory - even though the intermediate factorials return huge numbers. And huge means **huge**. It effortlessly handles numbers with a size on the order of 10^{8000}. That is 10 to the power of 8000 - a number with 8000 digits.

42! :)

The real answer is 0.0125: That’s the maximum length we need to choose for short links to get more than a 95% probability that in a network of 4000 nodes there are at most 5 nodes for which there are less than 70 peers with a distance of at most the maximum length.

If we can assume 5000 nodes, then 0.01 is enough. And since this is the number we directly got from an analysis of our link length distribution, it is the better choice, though it will mean that people with huge bandwidth cannot always max out their 100 connections.

Most of the time, floats are OK. But there are the times when you simply need exact math.

In these situations **Guile Scheme is a lifesaver**.

Dear GNU Hackers, thank you for this masterpiece!

And if you were crazy enough to read till here, **Happy Hacking** to you!

Anhang | Größe |
---|---|

2014-07-21-Mo-exact-math-to-the-rescue-guile-scheme.org | 4.41 KB |

- Druckversion
- Login to post comments