If you think about the (original) actor model, you'll find that it's defined by what you are not allowed to do. You are not allowed to share state. You are not allowed to use anything other than message passing for communication. You are not allowed to change your state arbitrarily, because you shall change your state only in response to a received message. If you think about functional programming, you'll find the exact same mindset. Functional programming is all about immutability (read: "you are not allowed to change objects") and prohibition of side effects (read: "you are not allowed to manipulate your arguments or some global state"). There are good reasons for such restrictions, but let's talk about Erlang first.
If you are familiar with Erlang, you've probably noticed that the word "actor" is not used at all. Neither in function names nor in its documentation. In fact, the dissertation of Joe Armstrong (the inventor of Erlang) doesn't even include the word "actor". Why? Erlang is always referred to as the reference implementation of the actor model, isn't it?
Well, Erlang is based on a simple observation. In the "old days", people started implementing operating systems able to execute multiple programs in parallel. Mostly because punched cards and batch processing are no fun. But this led to trouble. Serious trouble. In a naive multitasking OS, you are limited by the amount of freedom you (and other developers) have, because if you are free to write anywhere you want, you can easily corrupt someone else's program. That's why we need an MMU in our computer. It's impossible to write a program if you know that someone else might manipulate your state (memory) at any point in time. So, the OS organized running programs into processes that do not share state and the world was sane again. You can start a program ten times and each instance will have its own state. Pipes were added to allow processes to communicate. A process can also create a new process using fork. But then, people wanted concurrent execution of a single program, e.g., to keep GUIs responsive while doing background work. Threads were added, because fork is expensive and communicating via a pipe is complicated ... and the concurrency nightmare began.
Erlang had the simple idea to fix the real issue here: processes are too expensive in modern OSes and communication between processes is too complicated. Do you know why Erlang does not have a threading library? Because no language should! Programs are written by humans and humans cannot think concurrently. Erlang's VM provides lightweight processes with a simple way to exchange messages. And it abstracts away all the dirty details. Plus, message passing is network transparent.
Message passing is something people can imagine. It is something people can reason about. Abstraction is always the answer in computer science. And abstraction means building a simple model of something inherently complicated. Small systems communicating via message passing are something we can reason about. We can build large systems by "plugging" small systems together and we are still able to handle them. But hundreds of thousands of objects floating around, sharing state and running in parallel, are hard to comprehend. To quote Rich Hickey (have a look at this channel9 video if you don't know him): "Mutable stateful objects are the new spaghetti code".
By not sharing memory, some problems just disappear. Isolation is a restriction, but it allows you to focus on your problems and you can use all your creativity and skill to do this. Threads are broken by design. Some bright minds are trying to fix them, but I think threads are best avoided. Unfortunately, we cannot "undo" threads and we cannot go back in time to implement OSes better suited to writing concurrent software. However, we can treat "std::thread" (and friends) the way we treat "goto". It's in the language/STL, but you should not use it in production code. Well, maybe someone implements something that's useful and safe on top of it (libcppa for example).
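To make "something useful and safe on top of std::thread" a bit more concrete, here is a minimal sketch of an isolated, message-driven counter in C++17. This is not libcppa's API; every name in it (add_msg, get_msg, stop_msg, mailbox, counter_actor, demo_total) is made up for illustration. The only thing shared between threads is the mailbox, and the counter itself lives on the worker's stack, so it can only change in response to a received message.

```cpp
#include <condition_variable>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <variant>

// Hypothetical message protocol for the counter (not taken from any library).
struct add_msg { int amount; };
struct get_msg { std::promise<int> reply; };
struct stop_msg {};
using message = std::variant<add_msg, get_msg, stop_msg>;

// The mailbox is the single synchronization point. Senders push,
// the owning actor pops; nobody touches the queue directly.
class mailbox {
public:
    void push(message msg) {
        {
            std::lock_guard<std::mutex> guard(mtx_);
            queue_.push(std::move(msg));
        }
        cv_.notify_one();
    }

    // Blocks until a message is available.
    message pop() {
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        message msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<message> queue_;
};

class counter_actor {
public:
    // inbox_ is declared before worker_, so it exists before the thread starts.
    counter_actor() : worker_([this] { run(); }) {}

    ~counter_actor() {
        send(stop_msg{});
        worker_.join();
    }

    void send(message msg) { inbox_.push(std::move(msg)); }

private:
    void run() {
        int total = 0; // private state: unreachable from any other thread
        for (;;) {
            message msg = inbox_.pop();
            if (auto* a = std::get_if<add_msg>(&msg)) {
                total += a->amount;
            } else if (auto* g = std::get_if<get_msg>(&msg)) {
                g->reply.set_value(total);
            } else {
                return; // stop_msg
            }
        }
    }

    mailbox inbox_;
    std::thread worker_;
};

// Two senders hammer the counter concurrently, then we ask for the total.
int demo_total() {
    counter_actor counter;
    auto spam = [&counter] {
        for (int i = 0; i < 1000; ++i) counter.send(add_msg{1});
    };
    std::thread t1(spam);
    std::thread t2(spam);
    t1.join();
    t2.join();

    std::promise<int> reply;
    std::future<int> result = reply.get_future();
    counter.send(get_msg{std::move(reply)});
    return result.get();
}
```

Calling demo_total() from main should yield 2000: both senders have finished enqueuing before the get_msg is sent, and the mailbox is FIFO. Note that the calling code never locks anything and cannot race on the counter, because it has no way to reach it. All the thread and mutex handling is confined to one small class, which is roughly the point of hiding "std::thread" behind a safer abstraction.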
But let's get back to actors. Carl Hewitt et al. published an article about isolated computational entities, "actors", in 1973 (btw. there's an interesting video featuring Hewitt on channel9). It takes a theoretical point of view to answer the question "what is the minimum set of axioms we need to describe concurrency?" Erlang came from the opposite direction. They said "writing concurrent, fault-tolerant software in traditional programming models is extremely difficult, how can we provide a better model?" Interestingly enough, Erlang developers came up with a programming paradigm that's in fact based on the axioms of the actor model. So to speak, the wheel was invented twice. But Armstrong didn't come up with a remarkable name for the programming paradigm while Hewitt did.