use asan to analysis coredump

2023/06/30 cpp cpp coredump asan 共 22325 字,约 64 分钟
后端编程指北

Background

2023 线上实例经常性core online services have been core severl times

Coredump analyze

  • core 1: image

core occur in pb destructor when UpdateTable executes。

在pb析构函数里面产生了core

this use brpc double buffer data,deep copy, thread safe。

It seems that the memory has been corrupted.

使用brpc 双bufeer , 深拷贝,不会存在问题,看起来内存被写坏,出现异常的core

  • 另一个core:

image

running_version_map_ 受到 schedule_set_mutex_ 保护,线程安全。pb都是深拷贝出去的不会存在问题。

running_version_map_ is protected by schedule_set_mutex_ and is thread-safe. Protobufs are deep-copied, so there should be no issues.

Reproduce

相似性:其他的core也都类似,pb析构时导致core。都像是内存被破坏,导致core,大致问题在schedule调度过程中。

Similarity: Other cores are also similar in that they cause a core dump when pb is destructed. It seems like the memory is being corrupted, leading to the core dump. The main issue seems to be occurring during the schedule scheduling process.

调试信息缺失:缺少更多调试信息,无法查看详细栈变量。

Debugging information Lacking: There is a lack of additional debugging information, making it impossible to view detailed stack variables.

难复现:多次手动单个触发table更新,未出现对应core。构建多table同时更新场景,大量压,也很难复现。。

Difficulty in reproduction: It appears to be challenging to reproduce the issue as manually triggering individual table updates multiple times did not result in the corresponding core dump. Even in scenarios with multiple tables being updated simultaneously under heavy load, it remains difficult to reproduce the problem.

用ASAN跑一下看看

Running the code with ASAN (AddressSanitizer) to see if it helps identify any memory-related issues.

ASAN Analysis

参考ASAN 用法

在CMakeLists.txt添加编译选项, 先不使用recover机制

add ASAN compilation options in CMakeLists.txt, without  recovery mechanism,

if(ENABLE_ASAN)
    add_compile_options(-fsanitize=address -fno-stack-protector -fno-omit-frame-pointer -fno-var-tracking -g)
    add_link_options(-fsanitize=address -g )
endif()

自己mock一个HeartBeat线程,创建大量的UpdateTable任务,让worker 执行。

mock a HeartBeat thread and create many UpdateTable tasks。makes worker to execute

执行 ./bin/pressure_main 。

execute  ./bin/pressure_main 。

很快便出现ERROR

ERROR occurred quickly.

  • 具体检测到的信息

detail detected info

I0704 17:44:33.150276 871434 /root/platfom/worker/source/scheduler/scheduler.cc:264] Table [SEARCH__DB#SEARCH__TABLE] in schedule now, Can't execute version [1]
I0704 17:44:33.262889 871433 /root/platfom/worker/source/scheduler/scheduler.cc:397] Schedule [1] << [1] Succ!
I0704 17:44:33.262966 871433 /root/platfom/worker/source/scheduler/scheduler.cc:212] SEARCH__DB#SEARCH__TABLE, FinishVersionSchedule
=================================================================
==871423==ERROR: AddressSanitizer: heap-use-after-free on address 0x60e000007091 at pc 0x0000005c0c7f bp 0x7fac054f8d50 sp 0x7fac054f8d48
READ of size 1 at 0x60e000007091 thread T6
I0704 17:44:33.274636 871434 /root/platfom/worker/source/scheduler/generator.cc:43] CreateTableGroup SEARCH__DB|SEARCH__TABLE
I0704 17:44:33.276284 871434 /root/platfom/worker/source/scheduler/generator.cc:188] CreateTable [SEARCH__DB#SEARCH__TABLE#8.9]
    #0 0x5c0c7e in ScheduleSet::FinishSchedule(bool) /root/platfom/worker/source/scheduler/generator.h:15:13
    #1 0x5ba2e0 in ScheduleWrapper::End() /root/platfom/worker/source/scheduler/schedule.cc:210:20
    #2 0x5fb2d8 in WorkerScheduler::ScheduleThread() /root/platfom/worker/source/scheduler/scheduler.cc:406:31
    #3 0x61b87f in void std::__invoke_impl<void, void (WorkerScheduler::*)(), WorkerScheduler*>(std::__invoke_memfun_deref, void (WorkerScheduler::*&&)(), WorkerScheduler*&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/invoke.h:73:14
    #4 0x61b6f1 in std::__invoke_result<void (WorkerScheduler::*)(), WorkerScheduler*>::type std::__invoke<void (WorkerScheduler::*)(), WorkerScheduler*>(void (WorkerScheduler::*&&)(), WorkerScheduler*&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/invoke.h:95:14
    #5 0x61b6b1 in decltype(std::__invoke(_S_declval<0ul>(), _S_declval<1ul>())) std::thread::_Invoker<std::tuple<void (WorkerScheduler::*)(), WorkerScheduler*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/thread:244:13
    #6 0x61b664 in std::thread::_Invoker<std::tuple<void (WorkerScheduler::*)(), WorkerScheduler*> >::operator()() /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/thread:253:11
    #7 0x61b398 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (WorkerScheduler::*)(), WorkerScheduler*> > >::_M_run() /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/thread:196:13
    #8 0x7fac0b585de3  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xd6de3)
    #9 0x7fac0b33b608 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x9608)
    #10 0x7fac0b23f292 in clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)

0x60e000007091 is located 81 bytes inside of 152-byte region [0x60e000007040,0x60e0000070d8)
freed by thread T2 here:
    #0 0x570762 in operator delete(void*) (/root/platfom/build/bin/pressure_main+0x570762)
    #1 0x64dccf in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<ScheduleSet, std::allocator<ScheduleSet>, (__gnu_cxx::_Lock_policy)2> >::deallocate(std::_Sp_counted_ptr_inplace<ScheduleSet, std::allocator<ScheduleSet>, (__gnu_cxx::_Lock_policy)2>*, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/ext/new_allocator.h:125:2
    #2 0x64dc9f in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<ScheduleSet, std::allocator<ScheduleSet>, (__gnu_cxx::_Lock_policy)2> > >::deallocate(std::allocator<std::_Sp_counted_ptr_inplace<ScheduleSet, std::allocator<ScheduleSet>, (__gnu_cxx::_Lock_policy)2> >&, std::_Sp_counted_ptr_inplace<ScheduleSet, std::allocator<ScheduleSet>, (__gnu_cxx::_Lock_policy)2>*, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/alloc_traits.h:462:13
    #3 0x64ce2c in std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<ScheduleSet, std::allocator<ScheduleSet>, (__gnu_cxx::_Lock_policy)2> > >::~__allocated_ptr() /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/allocated_ptr.h:73:4
    #4 0x64d2a6 in std::_Sp_counted_ptr_inplace<ScheduleSet, std::allocator<ScheduleSet>, (__gnu_cxx::_Lock_policy)2>::_M_destroy() /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/shared_ptr_base.h:564:7
    #5 0x58f2cd in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/shared_ptr_base.h:171:10
    #6 0x58f1a8 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/shared_ptr_base.h:728:11
    #7 0x60a098 in std::__shared_ptr<ScheduleSet, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/shared_ptr_base.h:1167:31
    #8 0x5fec14 in std::shared_ptr<ScheduleSet>::~shared_ptr() /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/shared_ptr.h:103:11
    #9 0x60a5d2 in std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >::~pair() /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/stl_iterator.h:1262:12
    #10 0x60a5a8 in void __gnu_cxx::new_allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >, true> >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> > >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >*) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/ext/new_allocator.h:140:28
    #11 0x60a4d7 in void std::allocator_traits<std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >, true> > >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> > >(std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >, true> >&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >*) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/alloc_traits.h:487:8
    #12 0x60a469 in std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >, true> > >::_M_deallocate_node(std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >, true>*) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/hashtable_policy.h:2111:7
    #13 0x648d40 in std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> > >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_erase(unsigned long, std::__detail::_Hash_node_base*, std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >, true>*) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/hashtable.h:1903:13
    #14 0x6488b5 in std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> > >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_erase(std::integral_constant<bool, true>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/hashtable.h:1929:7
    #15 0x6486dc in std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> > >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::erase(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/hashtable.h:766:16
    #16 0x5ff83c in std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::shared_ptr<ScheduleSet>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::shared_ptr<ScheduleSet> > > >::erase(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/unordered_map.h:815:21
    #17 0x5f3b41 in WorkerScheduler::FinishVersionSchedule(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) /root/platfom/worker/source/scheduler/scheduler.cc:213:26
    #18 0x8ac2da in void std::__invoke_impl<void, void (WorkerScheduler::*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool), WorkerScheduler*&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool>(std::__invoke_memfun_deref, void (WorkerScheduler::*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool), WorkerScheduler*&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/invoke.h:73:14
    #19 0x8ac04b in std::__invoke_result<void (WorkerScheduler::*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool), WorkerScheduler*&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool>::type std::__invoke<void (WorkerScheduler::*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool), WorkerScheduler*&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool>(void (WorkerScheduler::*&)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool), WorkerScheduler*&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/invoke.h:95:14
    #20 0x8abf04 in void std::_Bind<void (WorkerScheduler::* (WorkerScheduler*, std::_Placeholder<1>, std::_Placeholder<2>))(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)>::__call<void, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool&&, 0ul, 1ul, 2ul>(std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool&&>&&, std::_Index_tuple<0ul, 1ul, 2ul>) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/functional:400:11
    #21 0x8abcd4 in void std::_Bind<void (WorkerScheduler::* (WorkerScheduler*, std::_Placeholder<1>, std::_Placeholder<2>))(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)>::operator()<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, void>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/functional:482:17
    #22 0x8ab5a6 in std::_Function_handler<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool), std::_Bind<void (WorkerScheduler::* (WorkerScheduler*, std::_Placeholder<1>, std::_Placeholder<2>))(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)> >::_M_invoke(std::_Any_data const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/std_function.h:297:2
    #23 0x5c98a9 in std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)>::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) const /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/std_function.h:687:14
    #24 0x5c0fed in ScheduleSet::FinishSchedule(bool) /root/platfom/worker/source/scheduler/generator.h:19:13
    #25 0x5ba2e0 in ScheduleWrapper::End() /root/platfom/worker/source/scheduler/schedule.cc:210:20
    #26 0x5bbfd1 in ScheduleWrapper::Run() /root/platfom/worker/source/scheduler/schedule.cc:286:9
    #27 0x5bc114 in RunSchedule(std::shared_ptr<ScheduleWrapper>) /root/platfom/worker/source/scheduler/schedule.cc:293:39
    #28 0x65c465 in std::future<ScheduleStatus> thread_pool::submit<ScheduleStatus (std::shared_ptr<ScheduleWrapper>), std::shared_ptr<ScheduleWrapper>, ScheduleStatus, void>(ScheduleStatus  const(&)(std::shared_ptr<ScheduleWrapper>), std::shared_ptr<ScheduleWrapper> const&)::'lambda'()::operator()() const /root/platfom/common/utils/thread_pool.hpp:270:51
    #29 0x65bf2c in std::_Function_handler<void (), std::future<ScheduleStatus> thread_pool::submit<ScheduleStatus (std::shared_ptr<ScheduleWrapper>), std::shared_ptr<ScheduleWrapper>, ScheduleStatus, void>(ScheduleStatus  const(&)(std::shared_ptr<ScheduleWrapper>), std::shared_ptr<ScheduleWrapper> const&)::'lambda'()>::_M_invoke(std::_Any_data const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/std_function.h:297:2

thread T2

ScheduleWrapper::Run 时执行了FinishVersionSchedule

table_str 对应的 ScheduleSet 删除。

During the execution of ScheduleWrapper::Run, the FinishVersionSchedule function is called, which removes the ScheduleSet corresponding to the table_str.

thread T6

再次调用了已经被T2释放的ScheduleSet 对象成员函数,并修改了对应类成员。引起异常行为。

An exception occurred when it use  ScheduleSet object that had already been released by T2.

Additionally, it modifies the object ScheduleSet data, resulting in unexpected behavior.

WorkerScheduler结构

再梳理下整体逻辑框架,为什么会出现这种情况。

Let’s review the overall logical framework to understand why this situation might occur.

整个结构中

In the overall structure:

image

hb_thread

  • 负责将创建一个table的任务集 ScheduleSet。如一个table1 同时有2个版本更新任务,创建一个 ScheduleSet  下挂两个 ScheduleWrapper 放记录TableName 下共2个任务 running_count 。 放到schedule_pool_  执行。

Responsible for creating a ScheduleSet for a table. For example, if table1 has two version update tasks at same time, it creates a ScheduleSet and attaches two ScheduleWrapper instances under the record TableName. running_count is 2, shows two tasks for a ScheduleSet. These tasks are then executed by the schedule_pool_.

  • 记录running_version_map_(key: TableName,value: schedule_set); 

Maintains the running_version_map_, key: TableName,value: schedule_set; 

  • 记录schedule_map_(key: ScheduleWrapperID, value: ScheduleWrapper);

Maintains the schedule_map_, key: ScheduleWrapperID, value: ScheduleWrapper;

schedule_thread

  • 负责轮询 schedule_map_ 里面的 ScheduleWrapper 状态,并在任务完成后将ScheduleSet running_count 减少一个,如果running_count == 0,删除running_version_map_ 里面的schedule_set

Responsible for polling the status of ScheduleWrapper instances in the schedule_map_. After a task is completed, it reduces the running_count of the corresponding ScheduleSet by 1. If running_count becomes zero, it deletes the corresponding schedule_set from the running_version_map_.

schedule_pool_

  • 负责ScheduleWrapper 执行,ScheduleWrapper  执行,同样在任务完成后将ScheduleSet running_count 减少一个,如果running_count == 0,删除running_version_map_ 里面的schedule_set

Responsible for executing  task ScheduleWrapper. After a task is completed, it reduces the running_count of the corresponding ScheduleSet by one. If running_count becomes zero, it deletes the corresponding schedule_set from the running_version_map_.

时序图:

image 可以看出

running_count 本身是用来标记ScheduleSet 下有几个正在执行的ScheduleWrapper

it shows that the running_count is used to indicate the number of currently executing ScheduleWrapper instances under a ScheduleSet.

问题就出在ScheduleWrapper 执行完 schedule_thread  schedule_pool_ 都会调用End Schedule, 将ScheduleSet running_count 减少一个

The problem occurs when both the schedule_thread and schedule_pool_ call the End Schedule function after the execution of a ScheduleWrapper, reducing the running_count of the ScheduleSet.

实际上算错了,一个ScheduleWrapper执行完,减少2,提前触发ScheduleSet 析构。

However, there is an error in the calculation: instead of reducing it by one, it is mistakenly reduced by two. As a result, the destruction of the ScheduleSet is triggered prematurely.

而同一个ScheduleSet 下的其他ScheduleWrapper 还仍然在执行,执行的ScheduleWrapper执行完 又会调用ScheduleSet成员函数,并修改里面的数据产生未定义行为

Meanwhile, other ScheduleWrapper instances under the same ScheduleSet may still be executing. When those ScheduleWrapper instances complete their execution, they also invoke member functions of the ScheduleSet, leading to modifications of its data and resulting in undefined behavior.

经验

对于调试信息缺失,又难复现的case,推荐借助工具ASAN/valgrind/VTune

For cases where debugging information is not enough and the issue is difficult to reproduce, it is recommended to use tools such as ASAN (AddressSanitizer), Valgrind, or VTune.

  • use as below
    • in CMakeList.txt
      if(ENABLE_ASAN)
      add_compile_options(-fsanitize=address -fsanitize-recover=address -fno-stack-protector -fno-omit-frame-pointer -fno-var-tracking -g)
      add_link_options(-fsanitize=address -g )
      endif()
      
    • before run program

export ASAN_OPTIONS=halt_on_error=0:use_sigaltstack=0:detect_leaks=1:malloc_context_size=15:log_path=./asan.log:suppressions=$SUPP_FILE `

Reference

文档信息

Search

    Table of Contents