bug-gplusplus
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

gcc 2.95.2 -O2 can get incorrect loop count


From: Geoff Kuenning
Subject: gcc 2.95.2 -O2 can get incorrect loop count
Date: Mon, 23 Apr 2001 00:52:13 -0700

The enclosed program is derived from my implementation of the Mersenne
Twist PRNG.  When compiled without optimization or with -O1, it
calculates correct results.  Using egcs 2.95.2 with -O2 or -O3,
generating -m486 code, it generates incorrect output.  I am compiling
on a Pentium running the SuSE 6.4 distribution, not that I think it
matters.  /proc/cpuinfo reads:

        processor       : 0
        vendor_id       : GenuineIntel
        cpu family      : 6
        model           : 6
        model name      : Mobile Pentium II
        stepping        : 10
        cpu MHz         : 363.966
        cache size      : 256 KB
        fdiv_bug        : no
        hlt_bug         : no
        sep_bug         : no
        f00f_bug        : no
        coma_bug        : no
        fpu             : yes
        fpu_exception   : yes
        cpuid level     : 2
        wp              : yes
        flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca 
cmov pat pse36 mmx fxsr
        bogomips        : 725.81

The cause is the loop at WORK_AROUND_OPTIMIZER_BUG (the #else version
is the code that induces the bug).  Examining the generated assembly
code, the optimizer apparently notices that there are references to
statevec[i-1] and statevec[i], and decides to convert these into
references to statevec[i] and statevec[i+1], with a corresponding
adjustment in the value of i.  Unfortunately, although it correctly
adjusts the initial value of the loop index, it does not make a
corresponding adjustment in the test value.  The result is that the
loop exits one pass too early.

The second loop, after the #endif, is compiled correctly.

If WORK_AROUND_OPTIMIZER_BUG is defined, the source code implements
the same modification that the optimizer is trying to make, but it
does so correctly (adjusting the test value, and also compensating for
the off-by-one value of i, between the first and second loops).  The
optimizer then stays out of the way and generates correct code.

I played a bit with trying to simplify the code, but the bug went
away, and I didn't have a lot of time to try different combinations.
I now suspect that one could discard the bit-shifting code, but that
the second loop may be necessary to reproduce the problem.
-- 
    Geoff Kuenning   address@hidden   http://www.cs.hmc.edu/~geoff/

"Du kannst dem Leben nicht mehr Tage geben, aber den Tag mehr Leben."
        -- source unknown, attribution requested (Goethe?)
(You can't give your life more days, but you can give your days more life.)

------------------------------cut here------------------------------
#undef WORK_AROUND_OPTIMIZER_BUG

#include <stdio.h>

#define MT_STATE_SIZE   624             /* Size of the MT state vector */

typedef struct
    {
    unsigned long       statevec[MT_STATE_SIZE];
                                        /* Vector holding current state */
    int                 stateptr;       /* Next state entry to be used */
    int                 initialized;    /* NZ if state was initialized */
    }
                        mt_state;

/*
 * Table of contents:
 */
void                    mts_seed32(mt_state* state, unsigned long seed);
                                        /* Set random seed for any generator */
unsigned long           mt_lrand(void); /* Generate 32-bit random value */

/*
 * The following values are fundamental parameters of the algorithm.
 * With the exception of the two masks, all of them were found
 * experimentally using methods described in Matsumoto and Nishimura's
 * paper.  They are exceedingly magic; don't change them.
 */

/* MT_STATE_SIZE is defined in the header file. */
#define RECURRENCE_OFFSET 397           /* Offset into state space for the */
                                        /* ..recurrence relation.  The */
                                        /* ..recurrence mashes together two */
                                        /* ..values that are separated by */
                                        /* ..this offset in the state */
                                        /* ..space. */
#define MATRIX_A        0x9908b0df      /* Constant vector A for the */
                                        /* ..recurrence relation.  The */
                                        /* ..mashed-together value is */
                                        /* ..multiplied by this vector to */
                                        /* ..get a new value that will be */
                                        /* ..stored into the state space. */

/*
 * Width of a long.  Don't change this even if your longs are 64 bits.
 */
#define BIT_WIDTH       32              /* Work with 32-bit words */

/*
 * Masks for extracting the bits to be mashed together.  The widths of these
 * masks are also fundamental parameters of the algorithm, determined
 * experimentally -- but of course the masks themselves are simply bit
 * selectors.
 */
#define UPPER_MASK      0x80000000      /* Most significant w-r bits */
#define LOWER_MASK      0x7fffffff      /* Least significant r bits */

/*
 * Tempering parameters.  These are perhaps the most magic of all the magic
 * values in the algorithm.  The values are again experimentally determined.
 * The values generated by the recurrence relation (constants above) are not
 * equidistributed in 623-space.  For some reason, the tempering process
 * produces that effect.  Don't ask me why.  Read the paper if you can
 * understand the math.  Or just trust these magic numbers.
 */
#define TEMPERING_MASK_B 0x9d2c5680
#define TEMPERING_MASK_C 0xefc60000
#define TEMPERING_SHIFT_U(y) \
                        (y >> 11)
#define TEMPERING_SHIFT_S(y) \
                        (y << 7)
#define TEMPERING_SHIFT_T(y) \
                        (y << 15)
#define TEMPERING_SHIFT_L(y) \
                        (y >> 18)

/*
 * Parameters of Knuth's PRNG (Line 25, Table 1, p. 102 of "The Art of
 * Computer Programming, Vol. 2, 2nd ed, 1981).
 */
#define KNUTH_MULTIPLIER \
                        69069

/*
 * Default 32-bit random seed if mts_seed32 wasn't called
 */
#define DEFAULT_SEED32  4357

/*
 * Many applications need only a single PRNG, so it's a nuisance to have to
 * specify a state.  For those applications, we will provide a default
 * state, and functions to use it.
 */
mt_state                mt_default_state;

/*
 * To generate double-precision random numbers, we need to divide the result
 * of mts_lrand or mts_llrand by 2^32 or 2^64, respectively.  The quickest
 * way to do that on most machines is to multiply by the inverses of those
 * numbers.  However, I don't trust the compiler to correctly convert the
 * corresponding decimal constant.  So we will compute the correct number at
 * run time as part of initialization, which will produce a nice exact
 * result.
 */
static double           mt_32_to_double;
                                        /* Multiplier to convert long to dbl */
static double           mt_64_to_double;
                                        /* Mult'r to cvt long long to dbl */

/*
 * In the recurrence relation, the new value is XORed with MATRIX_A only if
 * the lower bit is nonzero.  Since most modern machines don't like to
 * branch, it's vastly faster to handle this decision by indexing into an
 * array.  The chosen bit is used as an index into the following vector,
 * which produces either zero or MATRIX_A and thus the desired effect.
 */
static unsigned long    matrix_decider[2] =
                          {0x0, MATRIX_A};

static unsigned long    correct_values[] =
    {
    3510405877u, 4290933890u, 2191955339u,  564929546u,  152112058u,
    4262624192u, 2687398418u,  268830360u, 1763988213u,  578848526u,
    4212814465u, 3596577449u, 4146913070u,  950422373u, 1908844540u,
    1452005258u, 3029421110u,  142578355u, 1583761762u, 1816660702u,
    2530498888u, 1339965000u, 3874409922u, 3044234909u, 1962617717u,
    2324289180u,  310281170u,  981016607u,  908202274u, 3371937721u,
    2244849493u,  675678546u, 3196822098u, 1040470160u, 3059612017u,
    3055400130u, 2826830282u, 2884538137u, 3090587696u, 2262235068u,
    3506294894u, 2080537739u, 1636797501u, 4292933080u, 2037904983u,
    2465694618u, 1249751105u,   30084166u,  112252926u, 1333718913u,
     880414402u,  334691897u, 3337628481u,   17084333u, 1070118630u,
    2111543209u, 1129029736u, 2769716594u,  198749844u, 2123740404u,
    3372116884u,  667945179u, 1235233343u, 1413475797u, 2355129622u,
    3131889314u, 1361507145u, 3419810344u, 3753862504u, 2643520359u,
     854777807u, 2823672895u, 1563137348u, 2778071293u, 3360232580u,
    1294979669u, 2667002587u, 4095439979u,  806669383u, 3717814038u,
    1071953403u, 3630859637u,  771361748u, 1131385020u, 3697612515u,
    4264844916u, 1364828348u, 2863894246u, 1109863328u, 3066937191u,
    2778683115u, 3989613953u,  859474495u,  167522376u, 1094225558u,
    1963711766u, 1257324636u, 1729949323u,  125753755u, 1068698284u,
    1761594045u, 2106609220u, 1033190229u,  946183933u, 1100436279u,
     489306665u, 2385045156u,  658699819u, 3308017340u, 2385997510u,
     105622857u, 1976233741u, 1535497010u,  314176889u, 1247500738u,
    2138106664u, 2757078771u, 2433460811u, 3970906471u, 1944201130u,
    2366336502u, 2541539915u, 4284982935u,  224898482u, 1457988276u,
    2188154237u, 3655119144u,  979144237u, 1279857832u, 2711710163u,
    4093937260u, 2299893420u, 2121869254u, 2505459837u, 1847263294u,
    2974457633u, 3439803738u, 1672773492u,  225684598u, 2496857387u,
     611619631u, 3993022877u, 3508398016u, 1663733971u,  758566080u,
    1653863133u, 1701066037u, 3782883361u, 3343813716u, 2602577666u,
     629577870u, 3724093470u,  361301181u, 2674434977u, 1443899064u,
    4104796692u,  140754167u, 4230942998u,  447458027u,  743917836u,
    2818315151u, 1332110941u,  266033703u,  308176090u,  130356650u,
    3074197472u, 3093392044u, 1602712035u, 1856287253u, 3375970245u,
    3114872063u,  370413632u, 1435286098u,  645351481u,  908917276u,
    2440368438u, 1599412846u, 2500100729u,  790131292u, 3684521490u,
    4052663936u, 1566488352u,  295035953u, 4163443199u, 2997925650u,
     678068969u, 3667661122u, 1273385972u, 3555794914u,  762012623u,
    1091375739u, 1765680277u, 4238179361u, 3406590822u, 3797246256u,
     388618876u, 2160693563u,  349961032u, 1392360403u, 3579025090u,
     604490212u, 3577823930u, 2321462651u, 3790422055u, 3185735941u,
    2229660586u, 1128575466u,  884862908u, 3330757128u, 3194168238u,
     681073659u,  800341641u, 2006864675u, 2619636221u, 1577889182u,
    3575155952u, 2260116784u, 3689141878u, 1657848984u, 2642380231u,
    2878484564u, 3099953080u,  468844464u, 4134363631u, 4110639840u,
    1904558184u, 1937194892u, 3137602491u, 3643695698u,  820242909u,
    3068372731u, 3388748923u, 4235603065u, 2615115373u, 1555770904u,
    1520951990u, 1047053830u, 1663094463u, 3948465044u, 1077597088u,
    2996944317u, 3230974609u, 2019388652u,  865275198u, 2046271241u,
    3163620630u,  269072378u, 2253440754u,  277763586u, 1114595358u,
    2679817222u,  988885471u,  922982460u,  477046426u, 1161679484u,
    2950091998u, 2088879880u, 3516906501u, 1403601752u, 1184069822u,
    3531029460u, 3041313352u, 3359174170u, 2117925252u, 1529768389u,
    1253051929u, 1829668776u, 2678766355u, 2138924913u, 3147299808u,
     447212163u, 1069119222u, 3791704659u,  804904386u, 3580345412u,
    1700215583u,  904016717u,   74539975u, 1288558083u, 1746888271u,
     437611701u,  781576281u, 1166365552u, 2566333668u, 4292856916u,
    4076802480u, 3968908047u, 3057346051u, 2074623641u, 3653637364u,
     586217310u, 3211306229u, 2604443978u, 3303224219u,   50953535u,
    2118693299u,  460196852u, 3355264342u,  500727351u, 1681849672u,
     113995187u, 1200937601u, 3092561862u,  312936130u, 3916083024u,
    2561711416u, 3713941700u, 2240434908u,  637144400u, 1946468041u,
    3224254139u, 2242555323u, 2524984519u,  945834669u, 2805199117u,
    1879274691u, 3792160120u,   37879558u, 1205870756u, 3508020184u,
    2705733735u, 3134213377u, 3170077556u,    9055729u, 2147895752u,
    3788046325u, 2422273092u, 1228026268u, 1275162816u, 1385450594u,
     128490357u,  678715088u, 2466464403u, 2715463741u,  420067866u,
     396072989u, 1729810791u, 4146068843u,  389936193u, 1010939382u,
     794725144u,  633726173u, 3567793282u,  248988111u, 1254549356u,
    1989964616u, 1425389196u, 2202358947u, 2830413406u, 3798927382u,
    3442966168u,  269966252u, 2807646342u, 1794971636u, 1602977220u,
    1360381486u, 1805272988u,  554730358u, 4259897540u, 2168740091u,
     442942697u, 3744617621u, 3567326073u, 1232342342u, 3720210842u,
     303505483u, 3533762586u, 2136482547u, 2813945830u, 3051833277u,
    2830164695u, 3157472512u, 3186113013u, 1063305342u, 4133844231u,
    2026873438u,   94687021u,   42670911u, 2205451721u,  200274097u,
    3184342380u, 1216199867u, 1941807482u, 1902652464u, 4140258655u,
    1934749011u, 2626251395u, 1041046135u, 3256560316u, 2005577300u,
     367040511u, 2562291571u, 4179257085u,  394550850u, 3714400922u,
    3096861476u, 3729796880u, 3528309113u, 1483426196u,  821348673u,
    2057501468u, 3141389956u,  631596202u, 2325651773u, 3568751628u,
    2300178836u,  257777076u, 3455675689u, 2853908813u,  223950412u,
    2320715588u,  250378269u, 3557833937u,  927098460u,  315421373u,
    1957400381u,  725633141u, 3345409284u, 1114863797u, 1830809043u,
    1922707457u, 3785762071u, 3404751487u, 2434832700u, 1359584120u,
     860423718u, 2979956529u,  888063953u,  641814761u,  180527770u,
    3316862637u, 4012959929u,   28566252u, 1045558574u, 4131606737u,
     215968520u,  362234156u, 2248115936u, 3069089212u, 3403600809u,
    1812984601u, 1768934064u, 4007586160u, 1240758160u,   13887765u,
    3514615109u, 3827374039u, 3341198715u, 1982839159u, 2033151304u,
    4283902822u, 3744190534u,  726223056u, 3905246635u, 1875803225u,
    2030522753u, 2822074688u, 3325482280u, 1691268105u, 3033866845u,
    1716543028u, 1555574049u, 2661093496u, 1979855811u, 2251933935u,
    1276056752u, 3341241268u, 1892612984u, 2194846054u,  486586963u,
    2492823590u,  593230942u,  775986230u, 1255789287u, 2318099602u,
    3056263080u,  158332807u, 2451929550u, 3374135491u,  372847709u,
    1128359579u,  373993639u, 2419119952u,  829613207u, 2948211163u,
    2324165819u,  160722663u, 1444930279u,  765460462u, 3780495422u,
     592264489u,  316670611u, 2342138965u, 1439591408u, 3362218290u,
    2860902653u, 1116562887u, 3033679152u, 1381679779u, 1291533463u,
    1962666710u, 2222373514u, 1215751045u, 3569236064u, 1611254503u,
    1171727980u, 3057484484u, 3263787664u, 3065712276u,  781477153u,
    1881176626u, 3769042770u, 1193467712u, 2843905090u, 3437666358u,
    2160717604u,  419274206u, 2767079897u, 4007363510u, 1325555048u,
    4187736634u, 4066907010u, 3299749742u,  154249616u, 1941800223u,
    1157563946u,  705258893u,  860757803u, 1310317854u, 1898329708u,
    3491229573u,  312822728u, 3407001878u, 1029352733u, 3463910352u,
    2741163037u, 2705583812u, 2644345635u,  683012156u,  948328240u,
    2656867161u, 1644147624u, 2853875853u,  822059867u,  753937406u,
    1604103884u, 1756360543u, 3400647193u, 2802766030u, 2268191056u,
    3784643944u, 3009927237u,  394432056u, 1840177440u, 2651765924u,
    1205254585u, 3551482241u, 2857937506u, 2522509113u, 3675764066u,
    1234994787u, 3459183960u,  186529857u, 1329960799u, 2322397450u,
    1078548606u,  113242357u, 2421327506u, 3306100881u,  211880652u,
     847202265u, 1034020264u, 2374075486u,  755993425u, 2474409905u,
    1885945103u, 3588026819u, 3326201431u,  747273957u, 3172561912u,
    4064603659u, 2036147813u, 3539583488u, 2861164857u, 3303878586u,
    2385840167u,  734771685u,  804646316u, 1158163327u, 2080435695u,
    2455811362u, 2060318701u, 1223319334u, 2573853731u, 3341336861u,
    3207344772u, 1654544724u, 1227774824u, 1779567885u, 4241425455u,
    3942578957u, 2787959909u,   90390016u, 1235487669u, 2405269696u,
    4096898260u, 2121644059u, 2432007152u, 2649006803u,  355642145u,
    3998121079u, 3524581383u,  919279000u,  789586853u, 2465492036u,
     220701650u, 2947224279u, 1329892616u, 3729919501u, 1219039514u,
    3006882189u,  281458735u, 3800499491u,  730882493u, 2222118351u,
    4107301035u, 2687550208u, 3260886108u, 3701859392u, 3862191362u,
    1412535194u, 1694757183u,  772470279u,  689128388u, 2265554314u,
    3499902942u, 2845535450u, 2570802968u, 1947307958u, 4027367903u,
    1910806179u, 3858889779u, 4021735452u, 4078787932u, 3413032217u,
    2980053565u, 2148809521u, 1497338125u, 2040525958u, 3112062074u,
    2201881275u, 3672285015u, 3003099464u, 3809697025u, 3709177038u,
    3264926503u, 1707537043u, 1104985157u, 3691862420u, 1898692192u,
    2323051094u,  736562662u, 2300313916u, 3100486674u,  302436891u,
    1288467611u, 1981059467u, 3645778636u, 1609399619u, 3418162375u,
    3667843152u,  536318325u, 3040655354u, 2544467931u,  424476355u,
    1601741596u,  246832728u, 2677287644u, 2835422307u, 1127786001u,
     850991343u, 2067806155u, 3766212150u, 1001927094u, 3801068881u,
    3242308737u, 1560596676u, 1681618985u, 2562998394u, 3484825522u,
    2655391996u, 1746361319u, 3550836123u, 4038742826u,  735255465u,
    3954076606u, 4101370050u, 3167012089u, 1966689481u, 1576556523u,
     207036082u, 2337346669u, 3404885158u, 3372194402u, 2313772874u,
    1511011541u, 3934253759u,  152017013u, 2305096656u, 1003233114u,
    3788182044u, 2738083629u, 2667318735u, 4075851512u, 3919952624u,
    3934687504u,  955697805u, 3721361893u, 1892740917u, 1925356403u,
    3530645689u,   57355987u,  500963211u, 3812263275u, 3120702996u,
    3348394440u, 2648242605u, 2950965560u, 2906248872u, 4195607563u,
    1976120064u,  569029796u, 3204894397u,  836657553u, 4253104557u,
    4029524248u, 1167446730u, 3164256694u, 1225943621u,  978942573u,
    3887954057u, 4029693733u, 3371611138u, 2648127182u,  341670719u,
     780349063u, 1249088385u, 2825987206u, 1409751402u, 3493141543u,
    2454446995u, 3001899542u,  894695004u, 4113594037u, 2748119359u,
    3811278462u,  337072564u, 3551268535u, 4210316453u, 2857304716u,
    1656016234u, 3055850193u, 4074141119u, 2702683976u, 3903520288u,
    2708109896u, 1303194166u,  676764765u,    1073839u, 3417024471u,
     530027902u,  664548902u, 3934189521u, 1118172394u, 1598501076u,
    1353136139u, 3556356767u, 3851436279u,  787984702u, 3614996657u,
    2653843342u,  350845053u, 2540767452u,  341795141u, 4131579558u,
    2852231303u,  347703279u,  304754275u, 3637218358u, 1191420956u,
    4250273882u, 2217329477u, 3619012484u, 2320390083u, 1618600250u,
     100602741u, 3962829626u, 3325838530u, 3310041575u, 2202357234u,
    2410265700u, 1855854724u, 1586666379u, 1893433651u, 4212894970u,
    3078470962u, 3005791950u, 3645097109u, 2729330720u,  178175659u,
    1878759843u, 3613064024u, 2235022317u, 1229007963u, 1217716121u,
    3424643385u, 2025817426u, 2541310454u, 3491127040u,  834075061u,
    1476080952u,  527792572u, 3142617040u,  990164480u, 3538861805u,
    1101804820u,  254185979u, 2139277356u, 3053978085u, 3636278570u,
    1588526078u, 3265686058u, 3200724466u, 3305433961u, 1714292212u,
    2894641386u,  286242900u, 2390694965u, 1104137642u, 1729447649u,
    2603147116u, 1739535876u, 2332325654u, 3923517970u,  975963350u,
    3046750553u, 4287139816u, 2887426453u, 3205373337u, 2829120066u,
    3989557087u, 1090404329u, 2762938959u, 2016187695u,  188074317u,
     562585328u, 1185556641u, 3980179851u, 2922039667u, 3215853289u,
     946979886u, 2822925104u,  803185936u, 1852384677u, 2297156223u,
    2799487985u, 1756284627u, 2317738144u, 1335401093u,  674337096u,
    2899555797u, 3996541014u, 3014749938u,  283034558u, 3481962136u,
    3810308241u, 2725662577u,  744192370u, 3665583468u,  571653166u,
    3181462357u, 1782231829u,  563269281u, 1274600369u, 2386970918u,
     363976120u, 2931953430u,  614822572u, 1661454733u,  449226294u,
    3506516244u, 3427050773u, 2620464687u,  505585754u, 3256549852u,
     321182353u, 3890429063u, 2030404657u, 1191957414u, 3469037079u,
     947528907u,  578915297u, 2155016115u,  557136361u,  374100253u,
    1716551606u, 1064892772u, 2282836509u, 2205887435u,  735861059u,
    2088833463u, 3884993958u, 1520195389u, 3349226581u, 4116948321u,
    3654190067u, 3585109483u, 2170692034u, 4104050396u, 1431760441u,
     871040163u, 3131221596u, 4039637966u, 3255680597u, 1537348827u,
    4274038615u, 2766931218u, 3080056590u, 2427576150u, 1698622741u,
     491269657u, 2957152254u, 4003327323u, 1665870563u, 2898998885u,
     350910601u, 3377912239u,   56558544u, 2757732126u, 3003019578u,
    3020476484u, 1103562767u,  266867533u, 3238470691u,  101050052u,
    2174141449u, 4261294044u, 3377977245u, 4283033233u, 1649319944u,
    2407542500u, 1186270608u,  506638155u,  853299114u,  528018288u,
    3370360190u, 3432726533u,   33304936u,  746516661u,  864441540u,
     877624554u,  549536499u, 2512486873u, 1718378202u, 1877264374u,
    3907125156u, 3480301523u, 1222420918u, 2962412787u, 1897386667u,
     565602854u,  870635873u, 1837265416u, 2093005006u, 2069832336u,
     421542850u,  546177829u, 3737295211u, 2022730805u,  114777545u,
    1045010491u, 1474827040u,    7636789u, 3926378049u,  825140914u,
    3894878768u, 2235348820u, 4209167372u, 1363271078u, 1633012431u,
    2604598105u, 1475584955u,  363136592u, 2939993958u, 3395026321u,
    2769982499u, 2579982422u, 3377853512u, 1926697291u, 1309179303u
    };

int main(int argc, char *argv[])
    {
    long                i;
    unsigned long       random_value;
    int                 success = 0;    /* Note that zero = success */

    for (i = 0;  i < sizeof correct_values / sizeof correct_values[0];  i++)
        {
        random_value = mt_lrand();
        if (random_value != correct_values[i])
            {
            (void) fprintf (stderr,
              "Failure at value %ld: expected %lu, got %lu\n", i,
                correct_values[i], random_value);
            success = 1;
            break;
            }
        }
    return success;
    }

/*
 * Initialize a Mersenne Twist PRNG from a 32-bit seed.
 *
 * According to Matsumoto and Nishimura's paper, the seed array needs to be
 * filled with nonzero values.  (My own interpretation is that there needs
 * to be at least one nonzero value).  They suggest using Knuth's PRNG from
 * Line 25, Table 1, p.102, "The Art of Computer Programming," Vol. 2 (2nd
 * ed.), 1981.  I find that rather odd, since that particular PRNG is
 * sensitive to having an initial seed of zero (there are many other PRNGs
 * out there that have an additive component, so that a seed of zero does
 * not generate a repeating-zero sequence).  However, one thing I learned
 * from reading Knuth is that you shouldn't second-guess mathematicians
 * about PRNGs.  Also, by following M & N's approach, we will be compatible
 * with other implementations.  So I'm going to stick with their version,
 * with the single addition that a zero seed will be changed to their
 * default seed.
 */
void mts_seed32(
    mt_state*           state,          /* State vector to initialize */
    unsigned long       seed)           /* 32-bit seed to start from */
    {
    int                 i;              /* Loop index */

    if (seed == 0)
        seed = DEFAULT_SEED32;

    /*
     * Fill the state vector using Knuth's PRNG.  Be sure to mask down
     * to 32 bits in case we're running on a machine with 64-bit
     * longs.
     */
    state->statevec[MT_STATE_SIZE - 1] = seed & 0xffffffff;
    for (i = MT_STATE_SIZE - 2;  i >= 0;  i--)
        state->statevec[i] =
          (KNUTH_MULTIPLIER * state->statevec[i + 1]) & 0xffffffff;

    state->stateptr = MT_STATE_SIZE;
    state->initialized = 1;

    /*
     * Figure out the proper multiplier for long-to-double conversion.  We
     * don't worry too much about efficiency, since the assumption is that
     * initialization is vastly rarer than generation of random numbers.
     */
    mt_32_to_double = 1.0;
    for (i = 0;  i < BIT_WIDTH;  i++)
        mt_32_to_double /= 2.0;
    mt_64_to_double = mt_32_to_double;
    for (i = 0;  i < BIT_WIDTH;  i++)
        mt_64_to_double /= 2.0;
    }

/*
 * Generate a random number in the range 0 to 2^32-1, inclusive, working
 * from the default state vector.
 *
 * See mts_lrand for full commentary.
 */
unsigned long mt_lrand()
    {
    register unsigned long
                        random_value;   /* Pseudorandom value generated */

    /*
     * See if we have used up the last group and need to generate more.
     */
    if (mt_default_state.stateptr == 0)
        {
        int             i;              /* Index into the state */

        /*
         * It's time to generate MT_STATE_SIZE new pseudorandom words.
         * Start by making sure a random seed has been set.  If not, set
         * one.
         */
        if (!mt_default_state.initialized)
            mts_seed32(&mt_default_state, DEFAULT_SEED32);

        /*
         * Now generate the new pseudorandom values by applying the
         * recurrence relation.  We use two loops and a final
         * 2-statement sequence so that we can handle the wraparound
         * explicitly, rather than having to use the relatively slow
         * modulus operator.
         *
         * In essence, the recurrence relation concatenates bits
         * chosen from the current random value (last time around)
         * with the immediately preceding one.  Then it
         * matrix-multiplies the concatenated bits with a value
         * RECURRENCE_OFFSET away and a constant matrix.  The matrix
         * multiplication reduces to a shift and two XORs.
         */
#ifdef WORK_AROUND_OPTIMIZER_BUG
        for (i = MT_STATE_SIZE - 1;  --i >= RECURRENCE_OFFSET - 1;  )
            {
            random_value = (mt_default_state.statevec[i + 1] & UPPER_MASK)
              | (mt_default_state.statevec[i] & LOWER_MASK);
            mt_default_state.statevec[i + 1] = mt_default_state.statevec[i + 1 
- RECURRENCE_OFFSET]
              ^ (random_value >> 1) ^ matrix_decider[random_value & 0x1];
            }
        i++;
#else
        for (i = MT_STATE_SIZE;  --i >= RECURRENCE_OFFSET;  )
            {
            random_value = (mt_default_state.statevec[i] & UPPER_MASK)
              | (mt_default_state.statevec[i - 1] & LOWER_MASK);
            mt_default_state.statevec[i] = mt_default_state.statevec[i - 
RECURRENCE_OFFSET]
              ^ (random_value >> 1) ^ matrix_decider[random_value & 0x1];
            }
#endif
        for (  ;  --i >= 0;  )
            {
            random_value = (mt_default_state.statevec[i + 1] & UPPER_MASK)
              | (mt_default_state.statevec[i] & LOWER_MASK);
            mt_default_state.statevec[i + 1] =
              mt_default_state.statevec[i + 1 + MT_STATE_SIZE - 
RECURRENCE_OFFSET]
                ^ (random_value >> 1) ^ matrix_decider[random_value & 0x1];
            }
        /*
         * The final entry in the table requires the "previous" value
         * to be gotten from the other end of the state vector, so it
         * must be handled specially.
         */
        random_value = (mt_default_state.statevec[0] & UPPER_MASK)
          | (mt_default_state.statevec[MT_STATE_SIZE - 1] & LOWER_MASK);
        mt_default_state.statevec[0] =
          mt_default_state.statevec[MT_STATE_SIZE - RECURRENCE_OFFSET]
            ^ (random_value >> 1) ^ matrix_decider[random_value & 0x1];

        mt_default_state.stateptr = MT_STATE_SIZE;
        }

    /*
     * The state vector contains at least one usable random number.
     * Pick it up, and use the tempering transformation to generate
     * good distribution properties.
     */
    random_value = mt_default_state.statevec[--mt_default_state.stateptr];
    random_value ^= TEMPERING_SHIFT_U(random_value);
    random_value ^= TEMPERING_SHIFT_S(random_value) & TEMPERING_MASK_B;
    random_value ^= TEMPERING_SHIFT_T(random_value) & TEMPERING_MASK_C;
    random_value ^= TEMPERING_SHIFT_L(random_value);
    return random_value;
    }



reply via email to

[Prev in Thread] Current Thread [Next in Thread]