• acockworkorange@mander.xyz
    arrow-up
    39
    ·
    7 days ago

    In the industrial automation world and most of the IT industry, data is aligned to the nearest word. Depending on architecture, that’s usually either 16, 32, or 64 bits. And that’s the space a single Boolean takes.

    • ZILtoid1991@lemmy.world
      arrow-up
      20
      ·
      7 days ago

      That’s why I primarily use booleans in return parameters; beyond that, I try to use bitfields. My game engine’s tilemap format uses a 32-bit struct, with 16 bits selecting the tile, 12 bits selecting the palette, and 4 bits used for various bitflags (horizontal and vertical mirroring, X-Y axis invert, and a priority bit).
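
      A minimal sketch of a layout like that, assuming C++ bit-fields. The struct and field names are made up for illustration, and the order of bit-fields inside the word is implementation-defined, so this is not a portable file format by itself:

      ```cpp
      #include <cstdint>

      // Hypothetical 32-bit tile entry: 16 + 12 + 4 bits.
      // Names are illustrative, not taken from any real engine.
      struct TileEntry {
          std::uint32_t tile      : 16;  // tile index
          std::uint32_t palette   : 12;  // palette index
          std::uint32_t h_mirror  : 1;   // horizontal mirroring
          std::uint32_t v_mirror  : 1;   // vertical mirroring
          std::uint32_t xy_invert : 1;   // X-Y axis invert
          std::uint32_t priority  : 1;   // priority bit
      };

      // Holds on mainstream compilers (GCC, Clang, MSVC); the standard does not promise it.
      static_assert(sizeof(TileEntry) == 4, "expected one 32-bit word per tile");
      ```

      For data that gets written to disk or sent over a wire, explicit shifts and masks on a std::uint32_t are the safer choice, because the bit positions are then fixed by the code rather than by the compiler.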

      • acockworkorange@mander.xyz
        arrow-up
        30
        ·
        7 days ago

        Bit fields are a necessity in low level networking too.

        They’re incredibly useful, I wish more people made use of them.

        I remember I interned at a startup programming microcontrollers once and created a few bitfields to deal with something. Then the lead engineer went ahead and changed them to masked ints. Because. The most aggravating thing is that an int size isn’t consistent across platforms, so if they were ever to change platforms to a different word length, they’d be fucked as their code was full of platform specific shenanigans like that.

        /rant

    • bastion@feddit.nl
      arrow-up
      3
      ·
      5 days ago

      The alignment of the language and the alignment of the coder must be similar on at least one metric, or the coder suffers a development penalty for each degree of difference from the language’s alignment. This penalty stacks for each phase of the project.

      So, let’s say that the developer is a lawful good Rust zealot Paladin, but she’s developing in Python, a language she’s moderately familiar with. Since Python is neutral/good, she suffers a -1 penalty for the first phase, -2 for the second, -3 for the third, etc. This is because Rust (the Paladin’s native language) is lawful, and Python is neutral (one degree of difference from lawful), so she operates at a slight disadvantage. However, they are both “good”, so there’s no further penalty.

      The same penalty would occur if using C, which is lawful neutral - but the axis of order and chaos matches, and there is one degree of difference on the axis of good and evil.

      However, if that same developer were to code in Javascript (chaotic neutral), it would be at a -3 (-6, -9…) disadvantage, due to 2 and 1 degree of difference in alignment, respectively.

      Malbolge (chaotic evil), however, would be a -4 (-8, -12) plus an inherent -2 for poor toolchain availability.

      …hope this helps. have fun out there!

  • mavu@discuss.tchncs.de
    arrow-up
    14
    ·
    6 days ago

    This reminds me that I actually once made a class to store bools packed in uint8 array to save bytes.

    Had forgotten that. I think I have to update the list of top 10 dumbest things I ever did.
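
    For reference, a class like that might look roughly like the sketch below. The name PackedBools and its interface are assumptions for illustration, and there is no bounds checking:

    ```cpp
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Hypothetical packed-bool container: 8 flags per std::uint8_t.
    class PackedBools {
    public:
        explicit PackedBools(std::size_t count)
            : bytes_((count + 7) / 8, 0), count_(count) {}

        // Read flag i: pick the byte, shift the wanted bit down, mask it.
        bool get(std::size_t i) const {
            return (bytes_[i / 8] >> (i % 8)) & 1u;
        }

        // Write flag i: OR the mask in to set, AND with its complement to clear.
        void set(std::size_t i, bool value) {
            const std::uint8_t mask = static_cast<std::uint8_t>(1u << (i % 8));
            if (value) bytes_[i / 8] |= mask;
            else       bytes_[i / 8] &= static_cast<std::uint8_t>(~mask);
        }

        std::size_t size() const { return count_; }               // number of flags
        std::size_t byte_count() const { return bytes_.size(); }  // bytes of storage

    private:
        std::vector<std::uint8_t> bytes_;
        std::size_t count_;
    };
    ```

    Ten flags take two bytes here instead of ten, at the cost of a shift and a mask on every access.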

      • Iron Lynx@lemmy.world
        arrow-up
        7
        ·
        edit-2
        4 days ago

        ASCII was originally a 7-bit standard. If you type in ASCII on an 8-bit system, every leading bit is always 0.

        (Edited to specify context)

        At least ASCII is forward compatible with UTF-8

      • houseofleft@slrpnk.net
        English
        arrow-up
        4
        ·
        6 days ago

        Ascii needs seven bits, but is almost always encoded as bytes, so every ascii letter has a throwaway bit.

          • anton@lemmy.blahaj.zone
            arrow-up
            1
            ·
            4 days ago

            That boolean can indicate if it’s a fancy character; that way all ASCII characters are themselves, but if the boolean is set it’s something else. We could take the other symbols from a page of codes to fit the user’s language.
            Or we could let true mean that the character is larger, allowing us to transform all of Unicode to a format consisting of 8-bit parts.

        • FuckBigTech347@lemmygrad.ml
          arrow-up
          1
          ·
          6 days ago

          Some old software does use 8-bit ASCII for special/locale-specific characters. There is also the Unicode hack (UTF-8) where the high bit is used to determine whether the byte is part of a multi-byte sequence.
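
          A small sketch of how that looks in UTF-8, where the most significant bits of each byte say what role the byte plays. The enum and function names are made up for illustration, and this only checks the bit pattern; it does not validate a whole sequence:

          ```cpp
          #include <cstdint>

          enum class ByteKind { Ascii, Continuation, Lead, Invalid };

          // Classify one byte of a UTF-8 stream by its high bits.
          inline ByteKind classify_utf8_byte(std::uint8_t b) {
              if ((b & 0x80) == 0x00) return ByteKind::Ascii;         // 0xxxxxxx
              if ((b & 0xC0) == 0x80) return ByteKind::Continuation;  // 10xxxxxx
              if ((b & 0xE0) == 0xC0 ||                               // 110xxxxx
                  (b & 0xF0) == 0xE0 ||                               // 1110xxxx
                  (b & 0xF8) == 0xF0)                                 // 11110xxx
                  return ByteKind::Lead;
              return ByteKind::Invalid;                               // 11111xxx
          }
          ```

          Any byte with the high bit clear is plain ASCII, which is why ASCII text is already valid UTF-8.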

    • Anders429@programming.dev
      arrow-up
      35
      ·
      7 days ago

      It would be slower to read the value if you had to also do bitwise operations to get the value.

      But you can also define your own bitfield types to store booleans packed together if you really need to. I would much rather that than have the compiler do it automatically for me.

    • timhh@programming.dev
      arrow-up
      24
      ·
      7 days ago

      Well there are containers that store booleans in single bits (e.g. std::vector<bool> - which was famously a big mistake).

      But in the general case you don’t want that because it would be slower.

    • gamer@lemm.ee
      arrow-up
      6
      ·
      edit-2
      7 days ago

      Consider what the disassembly would look like. There’s no fast way to do it.

      It’s also unnecessary since 8 bytes is a negligible amount in most cases. Serialization is the only real scenario where it matters. (Edit: and embedded)

      • Croquette@sh.itjust.works
        arrow-up
        4
        ·
        6 days ago

        In embedded, if you are to the point that you need to optimize the bools to reduce the footprint, you fucked up sizing your mcu.

  • steeznson@lemmy.world
    arrow-up
    8
    ·
    6 days ago

    We need to be able to express 0 and 1 as integers so that functionality is just being overloaded to express another concept.

    Wait until the person who made this meme finds out about how many bits are being wasted on modern CPU architectures. 7 is the minimum possible wasted bits but it would be 31 on every modern computer (even 64b machines since they default to 32b ints).

  • ssfckdt@lemmy.blahaj.zone
    arrow-up
    6
    ·
    6 days ago

    I swear I read that MySQL DBs will store multiple bools in a row as bitmaps in one byte. I can’t prove it though.

    • excral@feddit.org
      arrow-up
      11
      ·
      6 days ago

      In terms of memory usage it’s a waste. But in terms of performance you’re absolutely correct. It’s generally far more efficient to check if a word is 0 than to check if a single bit is zero.

    • Aux@feddit.uk
      English
      arrow-up
      1
      ·
      6 days ago

      Usually the most effective way is to read and write the same amount of bits as the architecture of the CPU, so for 64 bit CPUs it’s 64 bits at once.

  • elucubra@sopuli.xyz
    arrow-up
    4
    ·
    6 days ago

    Could a kind soul ELI5 this? Well, maybe ELI8. I did quite a bit of programming in the 90-00s as part of my job, although nowadays I’m more of a script kiddie.

    • superheitmann@programming.dev
      arrow-up
      9
      ·
      6 days ago

      A Boolean is a true/false value. It can only be those two values and can therefore be represented by a single bit (1 or 0).

      In most languages a Boolean variable occupies the space of a full byte (8 bits) even though only one of those bits is needed to represent the Boolean.

      That’s mostly because computers can’t load a single bit; they can only load bytes. Memory is a single space where each byte has a numeric address, starting from 0 and going up to however much memory you have available. (That’s not strictly true, because on most operating systems each process gets its own virtual memory space, but it is true for many microcontrollers.) You can load and address each of these bytes, but it will always be a whole byte. That’s why booleans are stored as bytes: otherwise you’d have to pack them with other data at the same address, and that gets complicated.

      Talking about getting complicated, in C++ a std::vector<bool> is specialized as a bit field. Each value in that vector occupies only a single bit, so a vector of size 8 fits in a single byte. This becomes problematic when you want to store references or pointers to one of the elements, or when you’re working with them in a loop, because the elements are not of type bool but of some bool-reference proxy type.
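
      A short sketch of where the proxy type bites, assuming nothing beyond the standard library. The function names are made up for illustration:

      ```cpp
      #include <type_traits>
      #include <vector>

      // operator[] on std::vector<bool> returns a proxy object, not bool&.
      static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
                    "vector<bool> hands out a proxy type");
      static_assert(std::is_same<std::vector<char>::reference, char&>::value,
                    "other vectors hand out real references");

      // With `auto`, the proxy is copied, and the copy still aliases the vector.
      inline bool auto_copy_aliases_vector_bool() {
          std::vector<bool> v{false, false};
          auto first = v[0];  // deduced as std::vector<bool>::reference
          first = true;       // writes through to v
          return v[0];
      }

      // The same code with char deduces a plain value, so v is untouched.
      inline bool auto_copy_aliases_vector_char() {
          std::vector<char> v{0, 0};
          auto first = v[0];  // deduced as char: an independent copy
          first = 1;
          (void)first;        // silence "set but not used" warnings
          return v[0] != 0;
      }
      ```

      The first function returns true and the second returns false, even though the two bodies look the same. Generic code that assumes `auto x = v[i]` makes a copy can go wrong in exactly this way.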

      • Aux@feddit.uk
        English
        arrow-up
        2
        ·
        6 days ago

        And performance optimisation of a compiler for a 64 bit CPU will realign everything and each boolean will occupy 8 bytes instead.

    • feddup@feddit.uk
      English
      arrow-up
      4
      ·
      6 days ago

      A boolean value only needs 1 bit (on or off) for true or false. However, the smallest unit of addressable memory is a byte (8 bits), hence 7 bits are technically wasted.

      For low-memory devices you could instead store 8 different Boolean values in a single byte by using bit masking.
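
      Roughly what bit masking means in practice, as a sketch. The helper names are made up for illustration, and `bit` is assumed to be in the range 0 to 7:

      ```cpp
      #include <cstdint>

      // Treat one byte as eight independent flags, numbered 0 (lowest) to 7.
      inline std::uint8_t set_flag(std::uint8_t byte, unsigned bit) {
          return static_cast<std::uint8_t>(byte | (1u << bit));   // force the bit to 1
      }
      inline std::uint8_t clear_flag(std::uint8_t byte, unsigned bit) {
          return static_cast<std::uint8_t>(byte & ~(1u << bit));  // force the bit to 0
      }
      inline bool test_flag(std::uint8_t byte, unsigned bit) {
          return (byte >> bit) & 1u;                              // read the bit
      }
      ```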

  • KindaABigDyl@programming.dev
    arrow-up
    183
    ·
    8 days ago

    #include <stdbool.h>  /* for bool in C before C23 */

    typedef struct {
        bool a: 1;
        bool b: 1;
        bool c: 1;
        bool d: 1;
        bool e: 1;
        bool f: 1;
        bool g: 1;
        bool h: 1;
    } __attribute__((__packed__)) not_if_you_have_enough_booleans_t;
    
    • xthexder@l.sw0.com
      arrow-up
      41
      ·
      edit-2
      8 days ago

      Or just std::bitset<8> for C++. Bit fields are neat though, it can store weird stuff like a 3 bit integer, packed next to booleans
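
      A quick sketch of both ideas. The struct and function names are made up for illustration, and how tightly the bit-fields get packed is up to the compiler:

      ```cpp
      #include <bitset>
      #include <cstdint>

      // Illustrative struct: a 3-bit integer packed next to two booleans.
      struct Packed {
          std::uint8_t level : 3;  // holds 0..7
          bool visible       : 1;
          bool locked        : 1;
      };

      // Build a std::bitset<8> with bits 0, 2 and 4 set.
      inline std::bitset<8> make_flags() {
          std::bitset<8> flags;
          flags.set(0).set(2).set(4);
          return flags;
      }
      ```

      Note that std::bitset<8> is not guaranteed to be one byte in size; implementations commonly store the bits in a wider integer.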

    • h4x0r@lemmy.dbzer0.com
      English
      arrow-up
      16
      ·
      7 days ago

      This was gonna be my response to OP so I’ll offer an alternative approach instead:

      typedef enum flags_e : unsigned char {
        F_1 = (1 << 0),
        F_2 = (1 << 1),
        F_3 = (1 << 2),
        F_4 = (1 << 3),
        F_5 = (1 << 4),
        F_6 = (1 << 5),
        F_7 = (1 << 6),
        F_8 = (1 << 7),
      } Flags;
      
      int main(void) {
        Flags f = F_1 | F_3 | F_5;
        if (f & F_1 && f & F_3) {
          // do F_1 and F_3 stuff
        }
      }
      
      • anotherandrew@lemmy.mixdown.ca
        English
        arrow-up
        1
        ·
        edit-2
        7 days ago

        Why not if (f & (F_1 | F_3)) {? I use this all the time in embedded code.

        edit: never mind; you’re checking for both flags. I’d probably use (f & (F_1 | F_3)) == (F_1 | F_3) but that’s not much different than what you wrote.
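
        The difference between the two checks, as a small sketch. The constants stand in for the F_1..F_8 flags above, and the helper names has_any and has_all are made up for illustration:

        ```cpp
        #include <cstdint>

        // Plain constants standing in for the enum flags.
        constexpr std::uint8_t F_1 = 1u << 0;
        constexpr std::uint8_t F_3 = 1u << 2;
        constexpr std::uint8_t F_5 = 1u << 4;

        // True if at least one bit of `mask` is set in `flags`.
        constexpr bool has_any(std::uint8_t flags, std::uint8_t mask) {
            return (flags & mask) != 0;
        }

        // True only if every bit of `mask` is set in `flags`.
        constexpr bool has_all(std::uint8_t flags, std::uint8_t mask) {
            return (flags & mask) == mask;
        }
        ```

        With only F_1 set, has_any(f, F_1 | F_3) is true but has_all(f, F_1 | F_3) is false, which is the distinction the edit is about.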

    • mmddmm@lemm.ee
      arrow-up
      159
      ·
      8 days ago

      And compiler. And hardware architecture. And optimization flags.

      As usual, it’s some developer that knows little enough to think the walls they see around enclose the entire world.

      • timhh@programming.dev
        arrow-up
        4
        ·
        7 days ago

        I don’t think so. Apart from dynamically typed languages which need to store the type with the value, it’s always 1 byte, and that doesn’t depend on architecture (excluding ancient or exotic architectures) or optimisation flags.

        Which language/architecture/flags would not store a bool in 1 byte?

        • brian@programming.dev
          arrow-up
          1
          ·
          7 days ago

          things that store it as word size for alignment purposes (most common afaik), things that pack multiple bools into one byte (normally only things like bool sequences/structs), etc

          • timhh@programming.dev
            arrow-up
            1
            ·
            3 days ago

            things that store it as word size for alignment purposes

            Nope. bools only need to be naturally aligned, so 1 byte.

            If you do

            struct SomeBools {
              bool a;
              bool b;
              bool c;
              bool d;
            };
            

            it’s 4 bytes.
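
            This can be checked at compile time. A sketch, assuming a mainstream ABI where sizeof(bool) is 1 and int is aligned to its own size; the standard leaves both implementation-defined. BoolThenInt is a made-up struct added for contrast:

            ```cpp
            struct SomeBools {
                bool a;
                bool b;
                bool c;
                bool d;
            };

            // A bool followed by an int: the padding comes from the int's alignment, not the bool's.
            struct BoolThenInt {
                bool flag;
                int value;
            };

            static_assert(sizeof(bool) == 1, "bool is one byte on this ABI");
            static_assert(alignof(bool) == 1, "bool only needs byte alignment");
            static_assert(sizeof(SomeBools) == 4, "four bools, no padding");
            static_assert(sizeof(BoolThenInt) == 2 * sizeof(int), "padding inserted before the int");
            ```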

            • brian@programming.dev
              arrow-up
              1
              ·
              3 days ago

              sure, but if you have a single bool in a stack frame it’s probably going to be more than a byte. on the heap definitely more than a byte

              • timhh@programming.dev
                arrow-up
                1
                ·
                1 day ago

                but if you have a single bool in a stack frame it’s probably going to be more than a byte.

                Nope. If you can’t read RISC-V assembly, look at these lines:

                        sb      a5,-17(s0)
                ...
                        sb      a5,-18(s0)
                ...
                        sb      a5,-19(s0)
                ...
                

                That is it storing the bools in single bytes. Also, I only used RISC-V because I’m way more familiar with it than x86, but x86 will do the same thing.

                on the heap definitely more than a byte

                Nope, you can happily malloc(1) and store a bool in it, or malloc(4) and store 4 bools in it. A bool is 1 byte. Consider this a TIL moment.

                • brian@programming.dev
                  arrow-up
                  1
                  ·
                  18 hours ago

                  c++ guarantees that calls to malloc are aligned https://en.cppreference.com/w/cpp/memory/c/malloc .

                  you can call malloc(1) ofc, but calling malloc_usable_size(malloc(1)) is giving me 24, so it at least allocated 24 bytes for my 1, plus any tracking overhead

                  yeah, as I said, in a stack frame. not surprised a compiler packed them into single bytes in the same frame (but I wouldn’t be that surprised the other way either), but the system v abi guarantees at least 4 byte alignment of a stack frame on entering a fn, so if you stored a single bool it’ll get 3+ extra bytes added on the next fn call.

                  computers align things. you normally don’t have to think about it. Consider this a TIL moment.
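
                  A sketch for measuring the alignment side of this on a given system, using only standard calls (malloc_usable_size is a glibc extension and is left out here). The helper names are made up for illustration, and the result depends entirely on the allocator in use:

                  ```cpp
                  #include <cstddef>
                  #include <cstdint>
                  #include <cstdlib>

                  // Largest power of two (capped at 4096) that divides the given address.
                  inline std::size_t alignment_of_address(std::uintptr_t addr) {
                      std::size_t align = 1;
                      while (align < 4096 && addr % (align * 2) == 0) align *= 2;
                      return align;
                  }

                  // Allocate n bytes and report how the returned pointer happens to be aligned.
                  // Returns 0 if the allocation fails.
                  inline std::size_t observed_malloc_alignment(std::size_t n) {
                      void* p = std::malloc(n);
                      if (p == nullptr) return 0;
                      const std::size_t align =
                          alignment_of_address(reinterpret_cast<std::uintptr_t>(p));
                      std::free(p);
                      return align;
                  }
                  ```

                  Calling observed_malloc_alignment(1) shows what the allocator actually hands back for a one-byte request. That is a property of the allocation, though; the bool stored there still has a size of one byte.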

        • mmddmm@lemm.ee
          arrow-up
          1
          ·
          7 days ago

          Apart from dynamically typed languages which need to store the type with the value

          You know that depending on what your code does, the same C that people are talking about upthread doesn’t even need to allocate memory to store a variable, right?

            • timhh@programming.dev
              arrow-up
              2
              ·
              3 days ago

              I think he’s talking about if a variable only exists in registers. In which case it is the size of a register. But that’s true of everything that gets put in registers. You wouldn’t say uint16_t is word-sized because at some point it gets put into a word-sized register. That’s dumb.