C++ Concurrent Programming 1: Let's Start Managing Threads

All animals are equal,
but some animals are more equal than others.

George Orwell, Animal Farm

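Starting with this article, we will learn the basic ways of using threads. This article is the foundation for multithreaded development.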

Creating Threads

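The classic hello world example in the previous article used the most basic way of creating a thread, which is also the one we use most often. The constructor argument of a std::thread object must be a Callable Object: a function, a function object, a class member function, or a lambda expression. Below we show each of these four ways of creating a thread.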

Using a function as the argument

The Hello C++ Concurrency program from the previous article is the best example of constructing a std::thread from a function, so we will not repeat it here.

Using a function object as the argument

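A function object relies on C++'s overloaded call operator: an object of a class that implements operator() can be invoked as if it were a function. For example: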

#include <iostream>
#include <thread>

class hello
{
public:
    hello(){ }
    void operator()()const
    {
        std::cout << "Hello world" << std::endl;
    }
};

int main()
{
    hello h;
    std::thread t1(h);
    t1.join();
    return 0;
}

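One thing to note here: if you pass a temporary function object directly, the C++ compiler parses the std::thread object construction as a function declaration: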

std::thread t2(hello()); /* error, compile as std::thread t2(hello(*)()); */
std::thread t3((hello())); /* ok */
std::thread t4{ hello() }; /* ok */
t2.join();   /* compile error: expression must have class type */
t3.join();   /* ok */
t4.join();   /* ok */

Using a class member function as the argument

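To be passed directly to the std::thread constructor, a member function name must not be overloaded. In the example below, if world1() and world2() were both named world, compilation would fail: the constructor is a template, so the name has to be resolved to a single function before the remaining arguments are matched.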

#include <iostream>
#include <thread>
#include <string>

class hello
{
public:
    hello(){ }
    void world1()
    {
        std::cout << "Hello world" << std::endl;
    }
    void world2(std::string text)
    {
        std::cout << "Hello world, " << text << std::endl;
    }
};

int main()
{
    hello h;
    std::thread t1(&hello::world1, &h);
    std::thread t2(&hello::world2, &h, "lee");
    t1.join();
    t2.join();
    return 0;
}
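
If the member functions really do have to share a name, one common workaround is to select the overload explicitly with a cast to the exact member-function pointer type. The sketch below is only an illustration: the greeter class, its last member and the two helper functions are hypothetical names that do not appear in the original example.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Hypothetical class for illustration: both member functions are named "world".
class greeter
{
public:
    void world() { last = "Hello world"; }
    void world(std::string text) { last = "Hello world, " + text; }
    std::string last;   // written by the worker thread, read after join()
};

// Runs the one-argument overload on a new thread and returns the stored text.
std::string greet_with(const std::string& name)
{
    greeter g;
    // static_cast names the exact overload, so template deduction succeeds.
    auto one_arg = static_cast<void (greeter::*)(std::string)>(&greeter::world);
    std::thread t(one_arg, &g, name);
    t.join();           // join() synchronizes, so reading g.last is safe
    return g.last;
}

// Same idea for the zero-argument overload.
std::string greet_plain()
{
    greeter g;
    std::thread t(static_cast<void (greeter::*)()>(&greeter::world), &g);
    t.join();
    return g.last;
}
```

Wrapping the call in a lambda, for example [&g]{ g.world(); }, is another way to avoid the ambiguity, since overload resolution then happens inside the lambda body.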

Using a lambda as the argument

#include <iostream>
#include <thread>
#include <string>

int main()
{
    std::thread t([](std::string text){
        std::cout << "hello world, " << text << std::endl;
    }, "lee");
    t.join();
    return 0;
}

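When creating a thread object, keep in mind that starting a thread with a function that can access local variables of the creating function is a bad idea, unless the thread is guaranteed to finish before those variables go out of scope.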

Waiting for a Thread

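join() waits for the thread to finish. It can be called only once on a given thread object, because the call also cleans up the storage associated with the thread. Calling it again is a runtime error (a std::system_error is thrown).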

std::thread t([](){
    std::cout << "hello world" << std::endl;
});
t.join(); /* ok */
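t.join(); /* runtime error: throws std::system_error, t is no longer joinable */
if(t.joinable()) /* checking first avoids the error */
{
    t.join(); /* runs only while t is still joinable */
}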
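
The fragment above stops at the second join(), so here is a small sketch that can be run to observe both behaviors. The function names second_join_throws, join_if_joinable and guarded_join_count are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <system_error>
#include <thread>

// Returns true if a second join() on the same std::thread throws std::system_error.
bool second_join_throws()
{
    std::thread t([] {});
    t.join();                       // first join: fine
    try
    {
        t.join();                   // t is no longer joinable
    }
    catch (const std::system_error&)
    {
        return true;
    }
    return false;
}

// Joins only if the thread is still joinable; returns whether a join happened.
bool join_if_joinable(std::thread& t)
{
    if (!t.joinable())
        return false;
    t.join();
    return true;
}

// Calls join_if_joinable twice on one thread and counts how many joins actually ran.
int guarded_join_count()
{
    std::thread t([] {});
    int count = 0;
    if (join_if_joinable(t)) ++count;
    if (join_if_joinable(t)) ++count;
    return count;
}
```

With the joinable() check in place, the second attempt is simply skipped instead of throwing.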

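You need to choose the right place to call join(). If the parent thread throws an exception after the thread has started but before join() is reached, the join() call is skipped. The fix is to call join() on the exception path as well as on the normal path.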

#include <iostream>
#include <thread>
#include <string>
#include <stdexcept>

int main()
{
    std::thread t([](std::string text){
        std::cout << "hello world, " << text << std::endl;
    }, "lee");
    try
    {
        throw std::runtime_error("test");
    }
    catch (const std::exception& e)
    {
        std::cout << e.what() << std::endl;
        t.join();
    }
    if (t.joinable())
    {
        t.join();
    }
    return 0;
}

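The code above is not a fundamental solution to the problem: if anything else makes the function exit early, it does not help. The best approach is RAII.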

#include <iostream>
#include <thread>
#include <string>
#include <stdexcept>

class thread_guard
{
public:
    explicit thread_guard(std::thread _t)   /* by value, so a temporary can be passed */
        : t(std::move(_t))
    {
        if(!t.joinable())
            throw std::logic_error("No Thread");
    }

    ~thread_guard()
    {
        if (t.joinable())
        {
            t.join();
        }
    }
    thread_guard(thread_guard const&) = delete;
    thread_guard& operator=(thread_guard const &) = delete;
private:
    std::thread t;
};

void func()
{
    thread_guard guard(std::thread([](std::string text){
        std::cout << "hello world, " << text << std::endl;
    }, "lee"));
    try
    {
        throw std::runtime_error("test");
    }
    catch (...)
    {
        throw;
    }
}

int main()
{
    try
    {
        func();
    }
    catch (const std::exception& e)
    {
        std::cout << e.what() << std::endl;
    }
    return 0;
}

Detaching a Thread

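detach() separates the child thread from the parent thread. Detaching a thread avoids the exception-safety problem: even though the thread keeps running in the background, detaching ensures that std::terminate is not called when the std::thread object is destroyed.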

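Detached threads are often called daemon threads. Such threads are typically long-running: their lifetime may span nearly the whole lifetime of the application, and they may monitor the file system in the background, clean up caches, or optimize data structures.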

#include <iostream>
#include <thread>
#include <string>
#include <cassert>

int main()
{
    std::thread t([](std::string text){
        std::cout << "hello world, " << text << std::endl;
    }, "lee");

    if (t.joinable())
    {
        t.detach();
    }
    assert(!t.joinable());

    return 0;
}

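The code above uses joinable(). detach() must not be called on a std::thread object that has no thread of execution, so use joinable() to check whether the object can be joined or detached.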

Passing Arguments to a Thread

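Passing arguments to a thread is normally simple, but keep one thing in mind: **by default, even if the thread function's parameter is a reference type, the arguments are first copied into the thread's storage and only then accessed by the thread's code.** The thread's storage here means internal memory that the new thread can access. Consider the following example: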

void f(int i,std::string const& s);
std::thread t(f,3,"hello");

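Even though f's second parameter is a reference type, the string literal "hello" is still copied into thread t's storage (as a char const*) and only then converted to std::string. Nothing goes wrong in this case, but when the argument is a pointer to an automatic variable, as below, it is easy to get wrong.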

void f(int i,std::string const& s);
void oops(int some_param)
{
    char buffer[1024];
    sprintf(buffer, "%i",some_param);
    std::thread t(f,3,buffer);
    t.detach();
}

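In this case the pointer buffer is copied into thread t's storage, and oops may well return before buffer has been converted to std::string, which leads to undefined behavior. The fix is as follows: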

void f(int i,std::string const& s);
void not_oops(int some_param)
{
    char buffer[1024];
    sprintf(buffer,"%i",some_param);
    std::thread t(f,3,std::string(buffer));
    t.detach();
}

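As described above, every argument is copied once when it is passed to a thread, so even if the thread function's parameter is a reference, it refers only to that copy. Changes made through the parameter do not affect the value the caller passed in. Consider the following example: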

void update_data_for_widget(widget_id w,widget_data& data);
void oops_again(widget_id w)
{
    widget_data data;
    std::thread t(update_data_for_widget,w,data);
    display_status();
    t.join();
    process_widget_data(data);
}

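After thread t finishes, the value of data would be unchanged, so process_widget_data(data) would process the original value. (On standard-conforming implementations this call does not even compile, because the internal copy is passed to the function as an rvalue, which cannot bind to widget_data&.) We need to request pass-by-reference explicitly: wrapping the argument in std::ref solves the problem: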

void update_data_for_widget(widget_id w,widget_data& data);
void oops_again(widget_id w)
{
    widget_data data;
    std::thread t(update_data_for_widget,w,std::ref(data));
    display_status();
    t.join();
    process_widget_data(data);
}
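
Since widget_id, widget_data and the helper functions are not defined in this article, here is a minimal self-contained sketch of the same idea. The names add_ten and run_with_ref are hypothetical and serve only to make the effect of std::ref observable.

```cpp
#include <cassert>
#include <functional>   // std::ref
#include <thread>

// Hypothetical worker: writes its result through a reference parameter.
void add_ten(int& value)
{
    value += 10;
}

// Starts add_ten on a thread with std::ref so the worker sees the caller's variable.
int run_with_ref(int start)
{
    int data = start;
    std::thread t(add_ten, std::ref(data));
    t.join();           // after join(), the worker's write is visible here
    return data;
}
```

Calling run_with_ref(1) should give 11, because the worker modified data itself rather than an internal copy. The caller must keep data alive until the thread has finished, which the join() here takes care of.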

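For arguments that can be moved but not copied, such as std::unique_ptr objects, the move happens automatically if the source object is a temporary. If the source is a named variable, you must call std::move explicitly.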

void process_big_object(std::unique_ptr<big_object>);
std::unique_ptr<big_object> p(new big_object);
p->prepare_data(42);
std::thread t(process_big_object,std::move(p));
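
The fragment above leaves big_object undefined, so the following sketch fills it in with a hypothetical stand-in to show that ownership really leaves the caller. The out-parameter on process_big_object and the move_into_thread helper are assumptions added only so the result can be inspected.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical stand-in for the big_object used in the text.
struct big_object
{
    int value = 0;
    void prepare_data(int v) { value = v; }
};

// Takes ownership of the object; writes what it saw to *out (hypothetical out-parameter).
void process_big_object(std::unique_ptr<big_object> obj, int* out)
{
    *out = obj->value;
}

// Returns the value observed by the worker, or -1 if the caller still owned the pointer.
int move_into_thread()
{
    std::unique_ptr<big_object> p(new big_object);
    p->prepare_data(42);
    int seen = 0;
    std::thread t(process_big_object, std::move(p), &seen);
    const bool caller_released = (p == nullptr);  // ownership has left p
    t.join();
    return caller_released ? seen : -1;
}
```

The object is first moved into the thread's internal storage and from there into the obj parameter, so it is destroyed when process_big_object returns.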

Transferring Ownership of a Thread

std::thread is movable but not copyable. Use std::move to transfer ownership of a thread between std::thread objects.

void some_function();
void some_other_function();
std::thread t1(some_function);           /*  1  */
std::thread t2=std::move(t1);            /*  2  */
t1=std::thread(some_other_function);     /*  3: a temporary is moved from implicitly */
std::thread t3;                          /*  4  */
t3=std::move(t2);                        /*  5  */
t1=std::move(t3);                        /*  6: terminates the program, t1 still owns a thread */
t1.detach();
t1=std::move(t3);                        /*  7: ok when done instead of 6, t1 was detached first */

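Note that moving from a temporary is implicit, so *t1=std::thread(some_other_function);* does not need an explicit std::move call. Before a std::thread object is destroyed, you must either wait for join() to return or call detach(). Likewise, before ownership is transferred into a std::thread object, the thread that object already owns must have been joined or detached: you cannot "drop" a thread by assigning a new value to the std::thread object that manages it. At step 6, t1 is still associated with some_other_function, so t3's ownership cannot be handed to t1 directly, and the assignment calls std::terminate().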
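
The rules above can be checked with joinable(), which reports whether a std::thread object currently owns a thread. The following is a minimal sketch with hypothetical helper names (ownership_follows_move, safe_reassign). It uses empty lambdas in place of some_function and some_other_function.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Moves a running thread from one std::thread object to another and
// reports whether joinable() followed the ownership.
bool ownership_follows_move()
{
    std::thread t1([] {});
    std::thread t2 = std::move(t1);     // t2 now owns the thread, t1 owns nothing
    const bool moved = !t1.joinable() && t2.joinable();

    std::thread t3;                     // default-constructed: no thread
    t3 = std::move(t2);                 // safe: t3 owned nothing before the assignment
    const bool moved_again = !t2.joinable() && t3.joinable();

    t3.join();                          // release the thread before t3 is destroyed
    return moved && moved_again && !t3.joinable();
}

// Assigns into a std::thread that already owns a thread, joining first
// so the assignment does not call std::terminate().
bool safe_reassign()
{
    std::thread target([] {});
    std::thread source([] {});
    if (target.joinable())
        target.join();                  // or detach(), depending on intent
    target = std::move(source);
    const bool ok = target.joinable() && !source.joinable();
    target.join();
    return ok;
}
```

Whether to join() or detach() before reassigning depends on whether the caller still needs the old thread's work to be complete. The sketch joins because that is the more conservative choice.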

Because std::thread supports moves, ownership of a thread can be transferred out of a function.

std::thread f()
{
  void some_function();
  return std::thread(some_function);
}

std::thread g()
{
  void some_other_function(int);
  std::thread t(some_other_function,42);
  return t;
}

Likewise, ownership can be transferred into a function, so a std::thread instance can be passed as an argument.

void f(std::thread t);
void g()
{
  void some_function();
  f(std::thread(some_function));
  std::thread t(some_function);
  f(std::move(t));
}

Using this property, we can implement an RAII wrapper for thread objects.

class scoped_thread
{
public:
    explicit scoped_thread(std::thread _t)   /* takes ownership by value */
        : t(std::move(_t))
    {
        if (!t.joinable())
            throw std::logic_error("No Thread");
    }

    ~scoped_thread()
    {
        if (t.joinable())
        {
            t.join();
        }
    }
    scoped_thread(scoped_thread const&) = delete;
    scoped_thread& operator=(scoped_thread const &) = delete;
private:
    std::thread t;
};
struct func;   /* a callable type constructed from int&, defined elsewhere */
void do_something_in_current_thread();
void f() {
    int some_local_state = 0;
    scoped_thread t(std::thread(func(some_local_state)));
    do_something_in_current_thread();
}

Because threads can be moved, we can manage them collectively in a container. See the following code:

#include <algorithm>
#include <functional>
#include <thread>
#include <vector>

void do_work(unsigned id);
void f() {
    std::vector<std::thread> threads;
    for(unsigned i=0;i<20;++i)
    {
        threads.push_back(std::thread(do_work,i));
    }
    std::for_each(threads.begin(),threads.end(),
                  std::mem_fn(&std::thread::join));
}

Thread-Related Utilities

Number of Threads

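std::thread::hardware_concurrency() returns the number of threads that can run concurrently in a program. On a multi-core system this is usually the number of cores. The value is only a hint, though, and the function returns 0 when the information is not available.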
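
As an illustration of how this hint might be used, here is a sketch of a parallel sum that sizes its worker pool from hardware_concurrency(). The function names choose_thread_count and parallel_sum are hypothetical, and the fallback of 2 workers when the function returns 0 is an assumption, not a rule from the standard.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Picks a worker count: falls back to 2 when hardware_concurrency() returns 0,
// and never uses more workers than there are items (minimum 1).
unsigned choose_thread_count(std::size_t items)
{
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0)
        hw = 2;     // assumed fallback when the hint is unavailable
    const std::size_t n = std::min<std::size_t>(hw, items);
    return static_cast<unsigned>(std::max<std::size_t>(n, 1));
}

// Sums the vector by giving each worker one contiguous chunk.
long long parallel_sum(const std::vector<int>& values)
{
    const unsigned workers = choose_thread_count(values.size());
    const std::size_t chunk = values.size() / workers;
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> threads;

    for (unsigned i = 0; i < workers; ++i)
    {
        const std::size_t begin = i * chunk;
        // the last worker also takes the remainder
        const std::size_t end = (i + 1 == workers) ? values.size() : begin + chunk;
        threads.emplace_back([&values, &partial, i, begin, end] {
            partial[i] = std::accumulate(values.begin() + begin,
                                         values.begin() + end, 0LL);
        });
    }
    for (auto& t : threads)
        t.join();       // all partial results are complete after this loop
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Each worker writes only to its own slot of partial, so no locking is needed. For small inputs the cost of starting threads will likely outweigh any gain, so a real implementation would usually also enforce a minimum chunk size.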

Identifying Threads

The thread identifier type is std::thread::id, and it can be obtained in two ways.

Call the get_id() member function of a std::thread object to get it directly.
Call std::this_thread::get_id() from within the current thread to get that thread's identifier.

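Putting the current thread to sleep follows a similar pattern: call std::this_thread::sleep_for() or std::this_thread::sleep_until() in the same way that std::this_thread::get_id() is called.
std::thread::id objects can be freely copied and compared: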

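If two std::thread::id objects are equal, they represent the same thread, or both hold the "no thread" value.
If they are not equal, they represent two different threads, or one represents a thread and the other holds the "no thread" value.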
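
Both ways of obtaining an id, and the comparison rules just listed, can be seen in a short sketch. The function names empty_ids_are_equal and ids_match_inside_and_outside are hypothetical.

```cpp
#include <cassert>
#include <thread>

// A default-constructed std::thread::id represents "no thread".
bool empty_ids_are_equal()
{
    std::thread t;                              // owns no thread
    return t.get_id() == std::thread::id();
}

// The id seen from outside (t.get_id()) matches the id the thread sees for
// itself (std::this_thread::get_id()), and differs from the caller's id.
bool ids_match_inside_and_outside()
{
    std::thread::id seen_inside;
    std::thread t([&seen_inside] { seen_inside = std::this_thread::get_id(); });
    const std::thread::id seen_outside = t.get_id();   // read before join()
    t.join();
    return seen_inside == seen_outside
        && seen_outside != std::this_thread::get_id();
}
```

Note that t.get_id() has to be read before join(): once the thread has been joined, the std::thread object owns no thread and get_id() returns the "no thread" value.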

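std::thread::id instances are often used to check whether a particular thread needs to perform some operation. This is common when certain threads must do special work and we first need to identify them.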
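
One possible shape for such a check is to record the id of a designated "master" thread and compare against it inside shared code. The sketch below assumes that pattern. The names work_kind and classify_threads, and the integer encoding of the result, are hypothetical and chosen only to keep the example small.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: returns 1 when run on the designated "master" thread
// and 0 otherwise, standing in for "special work" versus "common work".
int work_kind(std::thread::id master_id)
{
    return std::this_thread::get_id() == master_id ? 1 : 0;
}

// Runs work_kind once on the calling (master) thread and once on a worker,
// and returns the two results encoded as master * 10 + worker.
int classify_threads()
{
    const std::thread::id master_id = std::this_thread::get_id();
    int on_worker = -1;
    std::thread t([&on_worker, master_id] { on_worker = work_kind(master_id); });
    t.join();
    const int on_master = work_kind(master_id);
    return on_master * 10 + on_worker;
}
```

Here the master id is captured by value and passed along explicitly. Storing it in a shared variable before any workers are started would work as well.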