Rounding floats from 64bit to 32bit (double to single) with 0.5 rule

help-octave

From:	hale812
Subject:	Rounding floats from 64bit to 32bit (double to single) with 0.5 rule
Date:	Sun, 25 Dec 2016 23:25:48 -0800 (PST)

It seems that the single() function converts an IEEE 754 double to single
precision by plain truncation, i.e. by simply dropping the bits that do not fit.

This becomes a problem of error accumulation when converting data for a
32-bit DSP with a long computation path.
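
To illustrate the accumulation concern, here is a minimal sketch in Python (not Octave) that compares the summed conversion error of plain truncation with that of round-to-nearest. The helper names and the sample data are hypothetical, and it assumes that Python's struct module rounds to nearest when packing a 4-byte float, as typical CPython builds do. It says nothing about what single() itself does.

```python
import math
import struct

def truncate_to_single(x):
    """Clear the 29 low significand bits that single precision cannot store."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    bits &= ~((1 << 29) - 1)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def round_to_single(x):
    """Round-trip through a 4-byte float (assumed to round to nearest)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Hypothetical test data: positive values well inside the normal single range.
samples = [0.1 * k for k in range(1, 1001)]

# math.fsum adds the per-sample errors without extra summation noise.
truncation_bias = math.fsum(truncate_to_single(x) - x for x in samples)
rounding_bias = math.fsum(round_to_single(x) - x for x in samples)

print("summed error, truncation:", truncation_bias)
print("summed error, rounding:  ", rounding_bias)
```

For positive inputs every truncation error has the same sign, so the errors add up, while the rounding errors have mixed signs and partly cancel.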

For better results, the number should first be rounded to the single-precision
layout (1 sign bit, 8 exponent bits, 23 significand bits) in its binary
representation, and only then truncated.

Is there an Octave tool for rounded conversion to single, or at least for
binary rounding that leaves the number a double with the surplus bits set to
zero?
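
The kind of binary rounding asked about can be sketched as follows, in Python rather than Octave and purely as an illustration. The function names are hypothetical. The sketch assumes finite inputs inside the normal single-precision range (no NaN, Inf, overflow or single subnormals) and uses round-half-to-even as the tie rule. The result stays a double whose 29 surplus significand bits are zero.

```python
import struct

DROPPED_BITS = 52 - 23  # low significand bits that single precision cannot hold

def double_to_bits(x):
    """Return the 64-bit IEEE 754 pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bits_to_double(bits):
    """Inverse of double_to_bits."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def truncate_to_single_grid(x):
    """Drop the surplus bits without rounding."""
    mask = (1 << DROPPED_BITS) - 1
    return bits_to_double(double_to_bits(x) & ~mask)

def round_to_single_grid(x):
    """Round to 23 stored significand bits, ties to even, then zero the rest.

    Assumes a finite x in the normal single-precision range.
    """
    mask = (1 << DROPPED_BITS) - 1
    half = 1 << (DROPPED_BITS - 1)
    bits = double_to_bits(x)
    remainder = bits & mask           # the part that will be discarded
    bits &= ~mask                     # truncated pattern
    last_kept = (bits >> DROPPED_BITS) & 1
    if remainder > half or (remainder == half and last_kept):
        # Sign-magnitude layout: adding one unit in the last kept place
        # moves away from zero; a carry into the exponent is still correct.
        bits += 1 << DROPPED_BITS
    return bits_to_double(bits)

x = 1.0 + 3 * 2.0**-24  # exactly halfway between two single-precision values
print(truncate_to_single_grid(x) == 1.0 + 2.0**-23)
print(round_to_single_grid(x) == 1.0 + 2.0**-22)
```

Once a value has been rounded this way it is exactly representable in single precision, so a later conversion to single no longer depends on how that conversion treats the discarded bits.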




Rounding floats from 64bit to 32bit (double to single) with 0.5 rule, hale812 <=
- Re: Rounding floats from 64bit to 32bit (double to single) with 0.5 rule, Sergei Steshenko, 2016/12/26
  - Re: Rounding floats from 64bit to 32bit (double to single) with 0.5 rule, hale812, 2016/12/26
    - Re: Rounding floats from 64bit to 32bit (double to single) with 0.5 rule, hale812, 2016/12/27
