perl查询空表引起的bug

在ali集团对数据库的数据采集、监控、分析等由xxagent完成,所以对xxagent运行稳定非常重要。下面介绍xxagent一个bug,导致数据查询异常。

1、数据库的lijie表中没有记录:

root@xx 10:28:26>show create table lijie \G;
*************************** 1. row ***************************
       Table: lijie
Create Table: CREATE TABLE `lijie` (
  `id` int(11) DEFAULT NULL,
  `name` varchar(32) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=gbk;
root@xx 10:28:30>select count(*) from lijie;
+----------+
| count(*) |
+----------+
|        0 |
+----------+

2、执行以下代码:

my $sql3 = "select id,name from lijie";
my @sql3re = DB::query($sql3);
$Common::log->debug("luoxuansun3->@sql3re->") if (@sql3re);
$Common::log->debug("luoxuansun4") if (DB::query($sql3));
my $sql3n  = scalar @sql3re;
my $sql3nn = @sql3re;
my $sql3nnn = DB::query($sql3);
$Common::log->debug("luoxuansun5"."->".$sql3n."->".$sql3nn."->".$sql3nnn."aa");

3、log信息

$xxagent logf|grep luoxuansun
2013-07-28 10:39:50 [DEBUG] luoxuansun3->-> 
2013-07-28 10:39:50 [DEBUG] luoxuansun5->1->1->aa
2013-07-28 10:39:52 [DEBUG] luoxuansun3->->
2013-07-28 10:39:52 [DEBUG] luoxuansun5->1->1->aa

4、现象
通过执行DB模块的query函数返回给@sql3re后,里面既然有了记录(实际是没有记录),但直接执行”DB::query($sql3)”却是没有数据,前后矛盾

5、写入一条记录

root@xx 10:28:57>insert into lijie values(1,'a');
Query OK, 1 row affected (0.00 sec)
log信息
2013-07-28 10:44:39 [DEBUG] luoxuansun3->HASH(0x1bbb1f8)-> 
2013-07-28 10:44:39 [DEBUG] luoxuansun4
2013-07-28 10:44:39 [DEBUG] luoxuansun5->1->1->1aa
注意:结果"luoxuansun5->1->1->1aa"并没有发生变化,但"luoxuansun4"打印出来,说明的确有数据存在

6、再写一条记录

root@xx 10:44:35>insert into lijie values(1,'b');
Query OK, 1 row affected (0.00 sec)
log信息
2013-07-28 10:46:53 [DEBUG] luoxuansun3->HASH(0x1bd6628) HASH(0x1bd7df8)->  
2013-07-28 10:46:53 [DEBUG] luoxuansun4 at  
2013-07-28 10:46:53 [DEBUG] luoxuansun5->2->2->2aa 
注意:结果集中有两条记录了,变化符合预期

7、排查DB的query函数

sub query { 
  my $sql = join '', @_; 
  if ( !defined($dbh) ) {
    $dbh = _connect;
  }
  for ( my $count = 1 ; $count <= $connect_retries ; $count++ ) {
    if ( defined($dbh) ) {
      my @res = $dbh->query($sql);
      my $x = scalar @res;
      $Common::log->debug("luoxuansat>$sql->$x");
      if ( scalar(@res) ) {
        $Common::log->debug("luoxuansat2->$sql->$x");
        return @res;
      } else {
        $dbh = _connect;
        $Common::log->error("query retires $count $sql"); 
        next;   
      }
    } else {
      $dbh = _connect;
      $Common::log->error("query retires $count $sql"); 
      next;   
    }
  }
}
 
问题是出在空表时,先把记录删除,看一下log信息
root@xx 10:46:44>delete from lijie;
Query OK, 2 rows affected (0.00 sec)
log信息
$xxagent logf|grep luoxuansat|grep lijie
2013-07-28 10:53:31 [DEBUG] luoxuansat>select id,name from lijie->0 
2013-07-28 10:53:31 [DEBUG] luoxuansat>select id,name from lijie->0
注意:调用$dbh->query返回记录集为0,那说问题就出在DB的query函数上了
log信息
$xxagent logf|grep "query retires"|grep lijie
2013-07-28 10:56:18 [ERROR] query retires 1 select id,name from lijie 
2013-07-28 10:56:18 [ERROR] query retires 2 select id,name from lijie 
2013-07-28 10:56:18 [ERROR] query retires 3 select id,name from lijie 
注意:结果为0的情况下,却走到了:
else {
        $dbh = _connect;
        $Common::log->error("query retires $count $sql"); 
        next;   
      }
而没有结果集返回

8、bug fixed:
但没有结果返回时,应该明确返回空数组给调用方或对空结果做特殊处理

sub query { 
  my $sql = join '', @_; 
  my @nores;   --设置空数组
  if ( !defined($dbh) ) {
    $dbh = _connect;
  }
  for ( my $count = 1 ; $count <= $connect_retries ; $count++ ) {
    if ( defined($dbh) ) {
      my @res = $dbh->query($sql);
      my $x = scalar @res;
      $Common::log->debug("luoxuansat>$sql->$x");
      if ( scalar(@res) ) {
        $Common::log->debug("luoxuansat2->$sql->$x");
        return @res;
      } else {
        $dbh = _connect;
        $Common::log->error("query retires $count $sql"); 
        next;   
      }
    } else {
      $dbh = _connect;
      $Common::log->error("query retires $count $sql"); 
      next;   
    }
   return @nores;--无结果集时返回
  }

9、修改效果

$xxagent logf|grep luoxuansun
2013-07-28 11:06:45 [DEBUG] luoxuansun5->0->0->0aa 
2013-07-28 11:06:48 [DEBUG] luoxuansun5->0->0->0aa 
2013-07-28 11:07:30 [DEBUG] luoxuansun5->0->0->0aa 
下面代码不再执行打印
$Common::log->debug("luoxuansun3->@sql3re->") if (@sql3re);
写入一条记录
root@xx 11:11:05>insert into lijie values(1,'a');
Query OK, 1 row affected (0.00 sec)
log信息
$xxagent logf|grep luoxuansun
2013-07-28 11:12:14 [DEBUG] luoxuansun3->HASH(0x212f5f0)->  
2013-07-28 11:12:14 [DEBUG] luoxuansun4 at  
2013-07-28 11:12:14 [DEBUG] luoxuansun5->1->1->1aa 
#xxagent logf|grep luoxuansat|grep lijie
2013-07-28 11:12:14 [DEBUG] luoxuansat>select id,name from lijie->1  
2013-07-28 11:12:14 [DEBUG] luoxuansat2->select id,name from lijie->1  
再写入一条
root@xx 11:11:11>insert into lijie values(1,'b');
Query OK, 1 row affected (0.00 sec)
log信息
$xxagent logf|grep luoxuansun
2013-07-28 11:14:51 [DEBUG] luoxuansun3->HASH(0x2140e58) HASH(0x21408a0)->  
2013-07-28 11:14:51 [DEBUG] luoxuansun4 at  
2013-07-28 11:14:51 [DEBUG] luoxuansun5->2->2->2aa 
#xxagent logf|grep luoxuansat|grep lijie
2013-07-28 11:14:51 [DEBUG] luoxuansat>select id,name from lijie->2  
2013-07-28 11:14:51 [DEBUG] luoxuansat2->select id,name from lijie->2

10、小结
上面把很多代码都省略,其实原本的代码函数之间调用还要复杂,为了把问题尽量简单,没有例出。问题总是有它的原因,只要你坚持。


Post a Comment