November 24, 2011

Signed Integers Considered Stupid (Like This Title)

Unrelated note: If you title your article "[x] considered harmful", you are a horrible person with no originality. Stop doing it.

Signed integers have always bugged me. I've seen quite a bit of signed integer overuse in C#, but it is most egregious when dealing with C/C++ libraries that, for some reason, insist on using for(int i = 0; i < 5; ++i). Why would you ever write that? i cannot possibly be negative and for that matter shouldn't be negative, ever. Use for(unsigned int i = 0; i < 5; ++i), for crying out loud.

But really, that's not a fair example. You don't actually lose anything by using an int for the loop variable there, because the loop's range is so small that signedness doesn't matter. The places where this becomes stupid are things like using a signed integer for height and width, or returning a signed count. Why on earth would you want to return a negative count? If the count fails, return an unsigned -1, which is just the maximum possible value for your chosen unsigned integral type. Of course, certain people seem to think this is a bad idea because then you will return the largest positive number possible. What if they interpret that as a valid count and try to allocate 4 gigs of memory? Well gee, I don't know, what happens when you try to allocate -1 bytes of memory? In both cases, something is going to explode, and in both cases, it's because the person using your code is an idiot. Neither way is safer than the other. In fact, signed integers cause far more problems than they solve.
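For instance, here's roughly what that looks like (a minimal sketch of my own, not from any particular library; the name NOT_FOUND is made up). The standard library already does exactly this: std::string::npos is defined as -1 converted to an unsigned type.

  #include <cstddef>
  #include <vector>

  // "Not found" sentinel: (size_t)-1 wraps to the maximum possible size_t,
  // the same trick std::string uses for std::string::npos.
  static const size_t NOT_FOUND = (size_t)-1;

  // Returns the index of the first occurrence of value, or NOT_FOUND.
  size_t find_index(const std::vector<int>& v, int value)
  {
    for (size_t i = 0; i < v.size(); ++i)
      if (v[i] == value)
        return i;
    return NOT_FOUND;
  }

The caller just checks if(find_index(v, x) == NOT_FOUND), exactly as they would check for npos.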

One of the most painfully obvious issues here is that virtually every single architecture in the world uses the two's complement representation of signed integers. In two's complement, an 8-bit signed integer type (a signed char in C++) has a largest positive value of 127 and a most negative value of -128. That means a signed integer can represent a negative number so large in magnitude that it cannot be represented as a positive number. What happens when you do (char)abs(-128)? It tries to return 128, which overflows back to... -128. This is the cause of a host of security problems, and what's hilarious is that a lot of people try to use this to fuel their argument that you should use C# or Java or Haskell or some other esoteric language that makes them feel smart. The fact is, any language with fixed-size integers has this problem. C# has it, Java has it, most languages have it to some degree. This bug doesn't mean you should stop using C++, it means you need to stop using signed integers in places they don't belong. Observe the following code:
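  /* From a printf-style width parser: '*' means the field width
     is supplied by the next variadic argument. */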
  if (*p == '*')
  {
    ++p;
    total_width += abs (va_arg (ap, int));
  }
This is retarded. Why on earth are you interpreting an argument as a signed integer, only to then immediately call abs() on it? So a brain-damaged programmer can throw in negative values and not blow things up? If it can only possibly be valid as a positive number, interpret it as an unsigned int. Even if someone tries putting in a negative number, it will only make total_width abnormally large, instead of potentially putting in -128, causing abs() to return -128 and producing a total_width that is far too small, leading to a buffer overflow and someone hacking into your program. And don't go declaring total_width as a signed integer either, because that's just stupid. Using an unsigned integer here closes a potential security hole and makes it even harder for a dumb programmer to screw things up¹.
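For comparison, here's what the fixed version might look like (a sketch, assuming total_width is also declared unsigned, as the paragraph above argues it should be):

  if (*p == '*')
  {
    ++p;
    /* A negative argument now just becomes an absurdly large width that
       fails loudly, instead of a too-small total_width that quietly
       overflows a buffer. */
    total_width += va_arg (ap, unsigned int);
  }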

I can only attribute the vast overuse of int to programmer laziness: unsigned int is just too long to write. Of course, that's what typedefs are for, so that isn't an excuse. Maybe they're worried a programmer won't understand how to put a -1 into an unsigned int? Even if they didn't, you could still cast the int to an unsigned int to serve the same purpose and close the security hole. I am simply at a loss as to why I see ints all over code that could never possibly be negative. If it could never possibly be negative, you are already assuming that it won't be negative, so it's a much better idea to just make it impossible for it to be negative instead of giving hackers 200 possible ways to break your program.
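And if unsigned int really is too long to write, the typedef is one line (a trivial sketch; the name uint is my own choice, not something standard):

  typedef unsigned int uint;

  uint count = -1;        /* well-defined: wraps to UINT_MAX, the maximum value */
  uint count2 = (uint)-1; /* the explicit cast says the same thing louder */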


¹ There's actually another error here in that total_width can overflow even when unsigned, and there is no check for that, but that's beyond the scope of this article.

November 7, 2011

Why Kids Hate Math

They're teaching it wrong.

And I don't just mean teaching the concepts incorrectly (although they do plenty of that), I mean their teaching priorities are completely backwards. Set theory is really fun, and basic set theory can be taught to someone without them needing to know how to add or subtract. We teach kids Venn diagrams but never teach them all the fun operators that go with them? Why not? You say they won't understand? Bullshit. If we can teach third graders binary, we can teach them set theory. We take forever to get around to teaching kids algebra because it's considered difficult. If something is a difficult conceptual leap, you don't want to delay it, you want to introduce the concepts as early as possible. I say start teaching kids algebra once they know basic arithmetic. They don't need to know how to do crazy weird stuff like x * x = x² (they don't even know what ² means), but you can still introduce them to the idea of representing an unknown value with x. Then you can teach them exponentiation and logs and all those other operators first in the context of numbers, and then in the context of unknown variables. Then algebra isn't some scary thing that makes all the people who don't understand math give up, it's something you simply grow up with.

In a similar manner, what the hell is with all those trig identities? Nobody memorizes those things! You memorize like 2 or 3 of them, and you almost only ever use sin²x + cos²x = 1. Likewise, nobody ever uses the integral trig identities, because if you are using them you should have converted your coordinate system to polar coordinates, and if you can't do that you can just look them up, for crying out loud. Factoring and completing the square can be useful, but forcing students to do these problems over and over, when they almost never show up in anything other than spoon-fed equations, is insane.

Partial fractions, on the other hand, are awesome and fun, so why on earth are they only taught in intermediate calculus?! Kids are ALWAYS trying to pull apart fractions like that, and we always tell them not to do it - why not just teach them the right way to do it? By the time they finally got around to teaching me partial fractions, I was expecting some horrifically difficult, painful, complex process. It isn't. You just follow a few rules and then zero out some terms. How can that possibly be harder than learning the concept of differentiation? And it's useful, too!
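To illustrate (a worked example of my own, not from the original post): to split 1/(x(x + 1)), write

  1/(x(x + 1)) = A/x + B/(x + 1)
  1 = A(x + 1) + Bx

then zero out each term in turn: x = 0 gives A = 1, and x = -1 gives B = -1, so

  1/(x(x + 1)) = 1/x - 1/(x + 1)

That's the whole trick, and a kid who knows fractions and basic algebra has everything needed to follow it.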

Let's say we want to teach someone basic calculus. How much do they need to know? Addition, subtraction, division, multiplication, fractions, exponentiation, roots, algebra, limits, and derivatives. You could teach someone calculus without them knowing what sine and cosine even are. You could probably argue that, with proper teaching, calculus would be about as hard, or maybe a little harder, than trigonometry. Trigonometry, by the way, has an inordinate amount of time spent on it. Just tell kids how right triangles work, sine/cosine/tangent, SOHCAHTOA, a few identities, and you're good. You don't need to know scalene and isosceles triangles. Why do we even have special names for them? Who gives a shit if a triangle has two sides of the same length? Either it's a right triangle and it's useful, or it's not a right triangle and you have to do some crazy law-of-sines shit that usually means your algorithm is just wrong, and the only time you ever actually need it you can just look up the formula, because it's an obtuse edge case that almost never comes up.

Think about that. We're asking kids to solve edge cases that never come up in reality, and grading how good they are at math based off of that. And then we're confused when they complain about math having no practical application? Well, duh. The sheer amount of time spent on useless topics is staggering. Calculus should be taught to high school freshmen. Differential equations and complex analysis go to the seniors, and by the time you get into college you're looking at combinatorics and vector analysis, not basic calculus.

I have already seen some heavily flawed arguments against this. Some people say that kids aren't interested in math, so this will never work. Since I'm saying that teaching kids advanced concepts early on is what will make them interested in math, this is a circular argument and invalid. Other people claim that the kids will never understand because of some bullshit about needing logical constructs first, which doesn't make sense, because you should still introduce the concepts. Introducing a concept early on and having the student be confused by it is a good thing, because it means they'll try to work it out over time. The more time you give them, the more likely it is to click. Besides, most students don't understand algebra under the current system anyway, so I fail to see the point of that argument. It's not working now, so don't try to change it or you'll make it worse? That's just pathetic.

TL;DR: Stop teaching kids stupid, pointless math they won't need and maybe they won't rightfully conclude that what they are being taught is useless.