[book Standardized Floating-Point typedefs for C and C++ [quickbook 1.7] [copyright 2014 Christopher Kormanyos, John Maddock, Paul A. Bristow] [license Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at [@http://www.boost.org/LICENSE_1_0.txt]) ] [authors [Kormanyos, Christopher], [Maddock, John], [Bristow, Paul A.] ] [last-revision $Date$] [/version 1.8.3] ] [template tr1[] [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf Technical Report on C++ Library Extensions]] [template C99[] [@http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf C99 Standard ISO/IEC 9899:1999]] [def __gsl [@http://www.gnu.org/software/gsl/ GSL-1.9]] [def __glibc [@http://www.gnu.org/software/libc/ GNU C Lib]] [def __hpc [@http://docs.hp.com/en/B9106-90010/index.html HP-UX C Library]] [def __cephes [@http://www.netlib.org/cephes/ Cephes]] [def __NTL [@http://www.shoup.net/ntl/ NTL A Library for doing Number Theory]] [def __NTL_RR [@http://shoup.net/ntl/doc/RR.txt NTL::RR]] [def __NTL_quad_float [@http://shoup.net/ntl/doc/quad_float.txt NTL::quad_float]] [def __MPFR [@http://www.mpfr.org/ GNU MPFR library]] [def __GMP [@http://gmplib.org/ GNU Multiple Precision Arithmetic Library]] [def __multiprecision [@http://www.boost.org/doc/libs/1_53_0_beta1/libs/multiprecision/doc/html/index.html Boost.Multiprecision]] [def __cpp_dec_float [@http://www.boost.org/doc/libs/1_53_0_beta1/libs/multiprecision/doc/html/boost_multiprecision/tut/floats/cpp_dec_float.html cpp_dec_float]] [def __R [@http://www.r-project.org/ The R Project for Statistical Computing]] [def __godfrey [link godfrey Godfrey]] [def __pugh [link pugh Pugh]] [def __NaN [@http://en.wikipedia.org/wiki/NaN NaN]] [def __errno [@http://en.wikipedia.org/wiki/Errno `::errno`]] [def __Mathworld [@http://mathworld.wolfram.com Wolfram MathWorld]] [def __Mathematica [@http://www.wolfram.com/products/mathematica/index.html Wolfram Mathematica]] [def __WolframAlpha 
[@http://www.wolframalpha.com/ Wolfram Alpha]]
[def __TOMS748 [@http://portal.acm.org/citation.cfm?id=210111 TOMS Algorithm 748: enclosing zeros of continuous functions]]
[def __TOMS910 [@http://portal.acm.org/citation.cfm?id=1916469 TOMS Algorithm 910: A Portable C++ Multiple-Precision System for Special-Function Calculations]]
[def __why_complements [link why_complements why complements?]]
[def __complements [link math_toolkit.stat_tut.overview.complements complements]]
[def __performance [link perf performance]]
[def __building [link math_toolkit.building building libraries]]
[def __e_float [@http://calgo.acm.org/910.zip e_float (TOMS Algorithm 910)]]
[def __Abramowitz_Stegun M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, NBS (1964)]
[def __DMLF [@http://dlmf.nist.gov/ NIST Digital Library of Mathematical Functions]]
[def __IEEE754 [@http://en.wikipedia.org/wiki/IEEE_floating_point IEEE_floating_point]]
[def __N3626 [@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3626.pdf N3626]]
[def __N1703 [@http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1703.pdf N1703]]

[/ Some composite templates]
[template super[x]'''<superscript>'''[x]'''</superscript>''']
[template sub[x]'''<subscript>'''[x]'''</subscript>''']
[template floor[x]'''⌊'''[x]'''⌋''']
[template floorlr[x][lfloor][x][rfloor]]
[template ceil[x] '''⌈'''[x]'''⌉''']

[/template header_file[file] [@../../../../[file] [file]]]

[note A printer-friendly PDF version of this manual is also available.]

[section:overview Overview]

The header `<boost/cstdfloat.hpp>` provides optional standardized floating-point `typedef`s having specified widths.
These are useful for writing portable code because they should behave identically on all platforms.
All `typedef`s are in `namespace boost`.

The `typedef`s include `float16_t, float32_t, float64_t, float128_t`,
their corresponding least and fast types, and the corresponding maximum-width type.
The `typedef`s are based on underlying built-in types such as `float`, `double`, or `long double`,
or on other compiler-specific non-standardized types such as `__float128`.
The underlying types of these `typedef`s must conform with the corresponding specifications
of binary16, binary32, binary64, and binary128 in __IEEE754 floating-point format
[@http://en.wikipedia.org/wiki/IEEE_floating_point].

The `typedef`s are based on __N3626, proposed for a new C++14 standard header `<cstdfloat>`,
and on __N1703, proposed for a new C language standard header `<stdfloat.h>`.

The 128-bit floating-point type, of great interest in scientific and numeric programming,
is not required in the boost header and may not be supplied for all platforms/compilers,
because compiler support for a 128-bit floating-point type is not mandated
by either the C standard or the C++ standard.

The following code uses `<boost/cstdfloat.hpp>` in combination with `<boost/math/special_functions.hpp>`
to compute a simplified version of the Jahnke-Emden-Lambda function.
Here, we use a floating-point type with exactly 64 bits (i.e., `float64_t`).
If we were to use, for instance, built-in `double`, then there would be no guarantee
that the code would behave identically on all platforms.
With `float64_t` from `<boost/cstdfloat.hpp>`, however, this is very likely,
because the type is required to have exactly 64 bits in the binary64 format,
which gives approximately 15 decimal digits of precision.

  #include <cmath>
  #include <boost/cstdfloat.hpp>
  #include <boost/math/special_functions.hpp>

  // Simplified Jahnke-Emden-Lambda function:
  // gamma(v + 1) * J_v(x) / (x / 2)^v.
  boost::float64_t jahnke_emden_lambda(boost::float64_t v, boost::float64_t x)
  {
    const boost::float64_t gamma_v_plus_one = boost::math::tgamma(v + 1);
    const boost::float64_t x_half_pow_v     = std::pow(x / 2, v);

    // cyl_bessel_j takes the order v first, then the argument x.
    return gamma_v_plus_one * boost::math::cyl_bessel_j(v, x) / x_half_pow_v;
  }

See `cstdfloat_test.cpp` for a more detailed test program.

[endsect] [/section:overview Overview]

[section:rationale Rationale]

The implementation of `<boost/cstdfloat.hpp>` is designed to utilize `<float.h>`, defined in the 1989 C standard.
The preprocessor is used to query certain preprocessor definitions in `<float.h>`
such as FLT_MAX, DBL_MAX, etc.
Based on the results of these queries, an attempt is made to automatically detect
the presence of built-in floating-point types having specified widths.
An unequivocal test regarding conformance with __IEEE754 (IEC 559) based on
[@http://en.cppreference.com/w/cpp/types/numeric_limits/is_iec559 `std::numeric_limits<>::is_iec559`]
is performed with `BOOST_STATIC_ASSERT`.

The header `<boost/cstdfloat.hpp>` makes the standardized floating-point `typedef`s
safely available in `namespace boost` without placing any names in `namespace std`.
The intention is to complement rather than compete with a potential future
C++ Standard Library that may contain these `typedef`s.
Should some future C++ standard include `<cstdfloat>` and `<stdfloat.h>`,
then `<boost/cstdfloat.hpp>` will continue to function, but will become redundant
and may be safely deprecated.

Because `<boost/cstdfloat.hpp>` is a boost header, its name conforms to the
boost header naming conventions, not the C++ Standard Library header naming conventions.

[note `<boost/cstdfloat.hpp>` [*cannot synthesize or create a `typedef` if the underlying type is not provided by the compiler].
For example, if a compiler does not have an underlying floating-point type with 128 bits
(highly sought-after in scientific and numeric programming),
then `float128_t` and its corresponding least and fast types are not provided by `<boost/cstdfloat.hpp>`.]

[warning As an implementation artifact, certain C macro names from `<float.h>`
may be visible to users of `<boost/cstdfloat.hpp>`.
Do not rely on these macros; they are not part of any Boost-specified interface.
Use `std::numeric_limits<>` for floating-point ranges, etc. instead.]

[endsect] [/section:rationale Rationale]

[section:exact_typdefs Exact-Width Floating-Point `typedef`s]

The `typedef float#_t`, with # replaced by the width, designates a
floating-point type of exactly # bits.
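
The properties implied by an exact-width type can be checked at compile time.
The following is a minimal sketch that uses only the standard library and assumes
that built-in `float` and `double` are the binary32 and binary64 types on the platform at hand
(common, but not guaranteed). The names `sketch::float32_t`, `sketch::float64_t`
and `sketch::is_exact_binary` are hypothetical stand-ins for illustration;
they are not the aliases or the detection logic of the boost header.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

namespace sketch {

// Hypothetical stand-ins for exact-width aliases.
// Assumption: on this platform float is binary32 and double is binary64.
typedef float  float32_t;
typedef double float64_t;

// True if T occupies exactly `width` bits, has `significand_bits` bits of
// significand (counting the implicit leading bit), and reports IEC 559 conformance.
template <typename T>
constexpr bool is_exact_binary(int width, int significand_bits)
{
  return    (sizeof(T) * CHAR_BIT == static_cast<std::size_t>(width))
         && (std::numeric_limits<T>::digits == significand_bits)
         && std::numeric_limits<T>::is_iec559;
}

} // namespace sketch

// binary32: 32 bits wide with a 24-bit significand.
static_assert(sketch::is_exact_binary<sketch::float32_t>(32, 24),
              "float is not binary32 on this platform");

// binary64: 64 bits wide with a 53-bit significand.
static_assert(sketch::is_exact_binary<sketch::float64_t>(64, 53),
              "double is not binary64 on this platform");
```

If either assertion fails to compile, the assumption about the built-in types
does not hold on that platform and a different underlying type would be needed.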
For example, `float32_t` denotes a single-precision floating-point type with
approximately 7 decimal digits of precision (equivalent to binary32 in __IEEE754).

Floating-point types specified in C and C++ are allowed to have
implementation-specific widths and formats.
However, if a platform supports underlying floating-point types
(conformant with __IEEE754) with widths of 16, 32, 64, 128 bits,
or any combination thereof, then `<boost/cstdfloat.hpp>` does provide the
corresponding `typedef`s `float16_t, float32_t, float64_t, float128_t`,
their corresponding least and fast types, and the corresponding maximum-width type.

The absence of `float128_t` is indicated by the macro `BOOST_NO_FLOAT128_T`.

[endsect] [/section:exact_typdefs Exact-Width Floating-Point `typedef`s]

[section:minimum_typdefs Minimum-width floating-point `typedef`s]

The `typedef float_least#_t`, with # replaced by the width, designates a
floating-point type with a [*width of at least # bits], such that no
floating-point type with lesser size has at least the specified width.
Thus, `float_least32_t` denotes the smallest floating-point type with
a width of at least 32 bits.

Minimum-width floating-point types are provided for all existing
exact-width floating-point types on a given platform.
For example, if a platform supports `float32_t` and `float64_t`,
then `float_least32_t` and `float_least64_t` will also be supported, etc.

[endsect] [/section:minimum_typdefs Minimum-width floating-point `typedef`s]

[section:fastest_typdefs Fastest minimum-width floating-point `typedef`s]

The `typedef float_fast#_t`, with # replaced by the width, designates
the [*fastest] floating-point type with a width of at least # bits.

There is no absolute guarantee that these types are the fastest for all purposes.
In any case, however, they satisfy the precision and width requirements.

Fastest minimum-width floating-point types are provided for all existing
exact-width floating-point types on a given platform.
For example, if a platform supports `float32_t` and `float64_t`,
then `float_fast32_t` and `float_fast64_t` will also be supported, etc.

[endsect] [/section:fastest_typdefs Fastest minimum-width floating-point `typedef`s]

[section:greatest_typdefs Greatest-width floating-point typedef]

The `typedef floatmax_t` designates a floating-point type capable of representing
any value of any floating-point type on a given platform.

The greatest-width `typedef` is provided for all platforms.

[endsect] [/section:greatest_typdefs Greatest-width floating-point typedef]

[section:macros Floating-Point Constant Macros]

All macros of the type `BOOST_FLOAT16_C, BOOST_FLOAT32_C, BOOST_FLOAT64_C, BOOST_FLOAT128_C, BOOST_FLOATMAX_C`
are always defined after inclusion of `<boost/cstdfloat.hpp>`.
These allow floating-point constants of at least the specified width to be declared.
For example:

  #include <boost/cstdfloat.hpp>

  // Declare Archimedes' constant (pi) with approximately 7 decimal digits of precision.
  static const boost::float32_t pi = BOOST_FLOAT32_C(3.1415926536);

  // Declare the Euler-gamma constant with approximately 34 decimal digits of precision.
  // This requires a platform that provides boost::float128_t.
  static const boost::float128_t euler =
    BOOST_FLOAT128_C(0.57721566490153286060651209008240243104216);

[endsect] [/section:macros Floating-Point Constant Macros]

[/ cstdfloat.qbk
  Copyright 2014 Christopher Kormanyos, John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]