MINSCN and Cache Fusion Read Consistent - 白红宇



Published: 2019-06-25


A forum user posted a demonstration, and we reproduce the experiment here on an 11.2.0.3 two-node RAC environment:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.oracledatabase12g.com

SQL> drop table test purge;

Table dropped.

SQL> alter system flush buffer_cache;

System altered.

SQL> create table test(id number);
insert into test values(1);
insert into test values(2);
commit;

/* Use rowid to locate the data block that holds the only two rows of TEST */
select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from test;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                               89233                                    1
                               89233                                    1

SQL> alter system flush buffer_cache;

System altered.

Instance 1, Session A runs the UPDATE:

SQL> update test set id=id+1 where id=1;

1 row updated.

Instance 1, Session B queries the X$BH buffer header view to check the state of the related buffers:

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1227595

The STATE column of the X$BH view gives the state of each buffer. The possible states are:

STATE       NUMBER
           KCBBHFREE         0       buffer free
           KCBBHEXLCUR       1       buffer current (and if DFS locked X)
           KCBBHSHRCUR       2       buffer current (and if DFS locked S)
           KCBBHCR           3       buffer consistent read
           KCBBHREADING      4       Being read
           KCBBHMRECOVERY    5       media recovery (current & special)
           KCBBHIRECOVERY    6       Instance recovery (somewhat special)
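To make the X$BH listings in this article easier to read, the state codes can be turned into labels with a small offline helper. This is only a sketch: the labels for states 0 to 6 paraphrase the list above, while the label for state 8 is an assumption (state 8 shows up in later output on Instance 1 after its current block is converted to a Past Image, but it is not part of the list). The names `X_BH_STATE` and `summarize` are made up for this illustration.

```python
# Labels for X$BH.STATE, paraphrasing the list above.
X_BH_STATE = {
    0: "free",
    1: "xcurrent",           # buffer current, locked X
    2: "scurrent",           # buffer current, locked S
    3: "cr",                 # consistent read copy
    4: "reading",
    5: "media recovery",
    6: "instance recovery",
    8: "past image (assumed)",  # assumption: not in the list above
}


def summarize(rows):
    """Group (state, cr_scn_bas) pairs, as copied from the query output, by state label."""
    summary = {}
    for state, scn in rows:
        label = X_BH_STATE.get(state, f"unknown({state})")
        summary.setdefault(label, []).append(scn)
    return summary


# Rows seen on Instance 1 right after the first UPDATE.
print(summarize([(1, 0), (3, 1227595)]))
```

Pasting the rows of any later listing into `summarize` gives a quick count of CR copies per block without reading the raw state numbers.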

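The states we need in this demonstration are state=1 (Xcurrent), state=2 (Scurrent) and state=3 (CR). Next, on Instance 2 we update another row in the same data block. This triggers a gc current block 2-way transfer that ships the current block to Instance 2, while the original "current block" on Instance 1 is converted to a Past Image: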

Instance 2 Session C

SQL> update test set id=id+1 where id=2;

1 row updated.

Instance 2 Session D

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1227641
         3    1227638

The STATE=1 Xcurrent block has been transferred to Instance 2. Now look at the GC state on Instance 1:

Instance 1 Session B

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1227641
         3    1227638
         8          0
         3    1227595

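This is where the question arises. When the forum user ran a SELECT against TEST again in Session A on Instance 1, the number of state=3 CR blocks dropped from the original three to one: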

Instance 1 Session A (the session that ran the first UPDATE)

SQL> alter session set events '10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high';

Session altered.

SQL> select * from test;

        ID
----------
         2
         2

select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1227716
         3    1227713
         8          0

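In the original post the forum user watched the number of CR blocks through the V$BH view and was puzzled to see it decrease after the SELECT. Looking directly at X$BH in the demonstration above, we can see that the original three CR blocks, at SCN versions 1227641, 1227638 and 1227595, were replaced after the SELECT by two CR blocks at different SCN versions, 1227716 and 1227713. Why does Oracle do this? Fortunately we enabled event 10708 and the rac.* diagnostic trace before running the SELECT, so let us look at the trace first: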

PARSING IN CURSOR #140444679938584 len=337 dep=1 uid=0 oct=3 lid=0 tim=1335698913632292 hv=3345277572 ad='bc0e68c8' sqlid='baj7tjm3q9sn4'
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ NO_PARALLEL("TEST") FULL("TEST") NO_PARALLEL_INDEX("TEST") */ 1 AS C1, 1 AS C2 FROM "SYS"."TEST" "TEST") SAMPLESUB
END OF STMT
PARSE #140444679938584:c=1000,e=27630,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=1950795681,tim=1335698913632252
EXEC #140444679938584:c=0,e=44,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=1950795681,tim=1335698913632390
*** 2012-04-29 07:28:33.632
kclscrs: req=0 block=1/89233
*** 2012-04-29 07:28:33.632
kclscrs: bid=1:3:1:0:7:80:1:0:4:0:0:0:1:2:4:1:26:0:0:0:70:1a:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:3f:0:1c:86:2d:4:0:0:0:0:a2:3c:7c:b:70:1a:0:0:0:0:1:0:7a:f8:76:1d:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:63:e5:0:0:0:0:0:0:10:0:0:0
2012-04-29 07:28:33.632578 : kjbcrc[0x15c91.1 76896.0][9]
2012-04-29 07:28:33.632616 : GSIPC:GMBQ: buff 0xba1e8f90, queue 0xbb79f278, pool 0x60013fa0, freeq 1, nxt 0xbb79f278, prv 0xbb79f278
2012-04-29 07:28:33.632634 : kjbmscrc(0x15c91.1)seq 0x2 reqid=0x1c(shadow 0xb4bb4458,reqid x1c)mas@2(infosz 200)(direct 1)
2012-04-29 07:28:33.632654 : kjbsentscn[0x0.12bbc1][to 2]
2012-04-29 07:28:33.632669 : GSIPC:SENDM: send msg 0xba1e9000 dest x20001 seq 24026 type 32 tkts xff0000 mlen x17001a0
2012-04-29 07:28:33.633385 : GSIPC:KSXPCB: msg 0xba1e9000 status 30, type 32, dest 2, rcvr 1
*** 2012-04-29 07:28:33.633
kclwcrs: wait=0 tm=689
*** 2012-04-29 07:28:33.633
kclwcrs: got 1 blocks from ksxprcv
WAIT #140444679938584: nam='gc cr block 2-way' ela= 689 p1=1 p2=89233 p3=1 obj#=76896 tim=1335698913633418
2012-04-29 07:28:33.633490 : kjbcrcomplete[0x15c91.1 76896.0][0]
2012-04-29 07:28:33.633510 : kjbrcvdscn[0x0.12bbc1][from 2][idx 
2012-04-29 07:28:33.633527 : kjbrcvdscn[no bscn <= rscn 0x0.12bbc1][from 2]
*** 2012-04-29 07:28:33.633
kclwcrs: req=0 typ=cr(2) wtyp=2hop tm=689

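The trace shows that, because statistics had never been gathered on TEST, dynamic sampling was triggered here. Dynamic sampling itself issues a CR read request against TEST, which produced one 'gc cr block 2-way' wait:

2012-04-29 07:28:33.632654 : kjbsentscn[0x0.12bbc1][to 2]

0x12bbc1 = 1227713, which matches one of the CR blocks seen in X$BH above. kjbsentscn[0x0.12bbc1][to 2] can be read as: a CR request (obj#=76896) for SCN=0x12bbc1=1227713, DBA=0x15c91.1 76896.0 was sent to Instance 2. The kjbrcvdscn function then confirmed [no bscn <= rscn 0x0.12bbc1][from 2], meaning there is no better "best version" than the SCN version 0x12bbc1 already received.

[Figure: CR Server Arch]

Only after dynamic sampling finished was the user's SELECT statement actually executed: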
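The hex-to-decimal conversions used in the reading of the trace can be checked with a few lines of Python. This is a rough sketch, not an Oracle tool: it assumes the `0x0.12bbc1` token is an SCN written as wrap.base in hex (with the wrap occupying the bits above a 32-bit base), and that `kjbcrc[0x15c91.1 76896.0]` carries a hex block number, a file number and a decimal object number, which is how the values line up with block 89233, file 1 and obj#=76896 elsewhere in this article. The function names are invented for the example.

```python
import re


def scn_from_trace(token):
    """Convert a 'wrap.base' hex pair such as '0x0.12bbc1' into a decimal SCN.

    Assumes SCN = wrap * 2**32 + base.
    """
    wrap_hex, base_hex = token.split(".")
    return (int(wrap_hex, 16) << 32) | int(base_hex, 16)


def decode_kjbcrc(line):
    """Extract block, file and object numbers from a kjbcrc[...] trace fragment."""
    m = re.search(r"kjbcrc\[0x([0-9a-f]+)\.([0-9a-f]+) (\d+)\.\d+\]", line)
    if m is None:
        raise ValueError("no kjbcrc fragment found")
    return {
        "block": int(m.group(1), 16),  # assumed hex
        "file": int(m.group(2), 16),   # assumed hex
        "obj": int(m.group(3)),        # decimal, matches obj# in the WAIT line
    }


print(scn_from_trace("0x0.12bbc1"))
print(decode_kjbcrc("2012-04-29 07:28:33.632578 : kjbcrc[0x15c91.1 76896.0][9]"))
```

The decoded block and file numbers agree with the dbms_rowid output at the start of the article, which is a useful sanity check that the trace lines really refer to the TEST block.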

PARSING IN CURSOR #140444682869592 len=18 dep=0 uid=0 oct=3 lid=0 tim=1335698913635874 hv=1689401402 ad='b1a188f0' sqlid='c99yw1xkb4f1u'
select * from test
END OF STMT
PARSE #140444682869592:c=4999,e=34017,p=0,cr=7,cu=0,mis=1,r=0,dep=0,og=1,plh=1357081020,tim=1335698913635870
EXEC #140444682869592:c=0,e=23,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335698913635939
WAIT #140444682869592: nam='SQL*Net message to client' ela= 7 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335698913636071
*** 2012-04-29 07:28:33.636
kclscrs: req=0 block=1/89233
*** 2012-04-29 07:28:33.636
kclscrs: bid=1:3:1:0:7:83:1:0:4:0:0:0:1:2:4:1:26:0:0:0:70:1a:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:2:0:1c:86:2d:4:0:0:0:0:a2:3c:7c:b:70:1a:0:0:0:0:1:0:7d:f8:76:1d:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:63:e5:0:0:0:0:0:0:10:0:0:0
2012-04-29 07:28:33.636209 : kjbcrc[0x15c91.1 76896.0][9]
2012-04-29 07:28:33.636228 : GSIPC:GMBQ: buff 0xba0e5d50, queue 0xbb79f278, pool 0x60013fa0, freeq 1, nxt 0xbb79f278, prv 0xbb79f278
2012-04-29 07:28:33.636244 : kjbmscrc(0x15c91.1)seq 0x3 reqid=0x1d(shadow 0xb4bb4458,reqid x1d)mas@2(infosz 200)(direct 1)
2012-04-29 07:28:33.636252 : kjbsentscn[0x0.12bbc4][to 2]
2012-04-29 07:28:33.636358 : GSIPC:SENDM: send msg 0xba0e5dc0 dest x20001 seq 24029 type 32 tkts xff0000 mlen x17001a0
2012-04-29 07:28:33.636861 : GSIPC:KSXPCB: msg 0xba0e5dc0 status 30, type 32, dest 2, rcvr 1
*** 2012-04-29 07:28:33.637
kclwcrs: wait=0 tm=865
*** 2012-04-29 07:28:33.637
kclwcrs: got 1 blocks from ksxprcv
WAIT #140444682869592: nam='gc cr block 2-way' ela= 865 p1=1 p2=89233 p3=1 obj#=76896 tim=1335698913637294
2012-04-29 07:28:33.637356 : kjbcrcomplete[0x15c91.1 76896.0][0]
2012-04-29 07:28:33.637374 : kjbrcvdscn[0x0.12bbc4][from 2][idx 
2012-04-29 07:28:33.637389 : kjbrcvdscn[no bscn <= rscn 0x0.12bbc4][from 2]
*** 2012-04-29 07:28:33.637
kclwcrs: req=0 typ=cr(2) wtyp=2hop tm=865

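Similarly, "SELECT * FROM TEST" also triggered one 'gc cr block 2-way' wait:

2012-04-29 07:28:33.637374 : kjbrcvdscn[0x0.12bbc4][from 2][idx 
2012-04-29 07:28:33.637389 : kjbrcvdscn[no bscn 

In the end the foreground process got the CR at SCN version 1227716 from the remote LMS, which again matches the SCN we saw in X$BH. This explains why two CR blocks with higher SCNs appeared on Instance 1, but it still does not explain why the three lower-SCN CR blocks that had been in Instance 1's buffer cache disappeared. Consider the following demonstration: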

SQL> alter system set "_enable_minscn_cr"=false scope=spfile;

System altered.

SQL> alter system set "_db_block_max_cr_dba"=20 scope=spfile;

System altered.

SQL> startup force;
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area 1570009088 bytes
Fixed Size                  2228704 bytes
Variable Size             989859360 bytes
Database Buffers          570425344 bytes
Redo Buffers                7495680 bytes
Database mounted.
Database opened.

With "_enable_minscn_cr"=false and "_db_block_max_cr_dba"=20 set as above and all RAC instances restarted, we repeat the demonstration:

Each time Session C on Instance 2 updates the data block, we run one query on Instance 1, so that Instance 1 requests a CR copy over and over.

SQL> update test set id=id+1 where id=2;              -- Instance 2

1 row updated.

SQL> select * From test;                         -- Instance 1

        ID
----------
         1
         2

The X$BH rows on Instance 1 are:

select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1273080
         3    1273071
         3    1273041
         3    1273039
         8          0

SQL>  update test set id=id+1 where id=3;

1 row updated.

SQL> select * From test;

        ID
----------
         1
         2

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1273091
         3    1273080
         3    1273071
         3    1273041
         3    1273039
         8          0

...................

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1273793
         3    1273782
         3    1273780
         3    1273769
         3    1273734
         3    1273715
         3    1273691
         3    1273679
         3    1273670
         3    1273643
         3    1273635
         3    1273623
         3    1273106
         3    1273091
         3    1273080
         3    1273071
         3    1273041
         3    1273039
         3    1273033

19 rows selected.

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1274993

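As the demonstration shows, after setting "_enable_minscn_cr" (enable/disable minscn optimization for CR) to false and "_db_block_max_cr_dba" (Maximum Allowed Number of CR buffers per dba) to 20, Instance 1 cached as many as 19 CR versions of the same data block at its peak.

"_enable_minscn_cr" is a hidden parameter introduced in 11g. It controls whether Oracle computes a minimum SCN for CR blocks: when a foreground process receives a newer (higher-SCN) CR version of a data block, it may purge the older, lower-SCN CR blocks of that block from the cache buffers chain (CBC). The goal is to reduce the number of CR copies of the same block, at different SCN versions, that sit in the buffer cache. Note that at both statement level and transaction level, the required snapshot SCN (Snap_Scn) is always the current SCN at the moment the statement or transaction started.

Keeping some older CR blocks may help long-running queries or cursors, but the total number of CR versions of one data block in an instance's buffer cache is limited. That total is governed by the hidden parameter "_db_block_max_cr_dba": with the value 20 used in our demonstration, up to 19 CR versions can be cached. The default value of "_db_block_max_cr_dba" is 6, so by default an instance's buffer cache holds no more than 6 CR versions of the same data block at a time.

There is good reason for the "_enable_minscn_cr" MINSCN optimization. Even after the older CR versions have been aged out, a foreground process can still issue a CR request whenever it needs one and have the holder instance's LMS build a best CR block, so this is nothing to worry about.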
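The interplay of the two parameters can be pictured with a toy model. This is deliberately not Oracle's internal algorithm, which this article does not show: the class `CrChain`, the caller-supplied `minscn` threshold, the "drop copies below minscn" rule and the "evict the oldest beyond the cap" rule are all assumptions made for illustration, and the threshold 1227700 is an arbitrary value chosen so that the model reproduces the X$BH listings from the first demonstration.

```python
class CrChain:
    """Toy model of the CR copies of one data block cached in one instance."""

    def __init__(self, max_cr, enable_minscn=True):
        self.max_cr = max_cr                # hypothetical cap on CR copies kept
        self.enable_minscn = enable_minscn  # stands in for "_enable_minscn_cr"
        self.cr_scns = []                   # newest first, like the x$bh listings

    def receive(self, scn, minscn=0):
        """Cache a newly received CR copy at `scn`.

        `minscn` is a hypothetical threshold supplied by the caller; copies
        below it are dropped when the optimization is enabled.
        """
        if self.enable_minscn:
            self.cr_scns = [s for s in self.cr_scns if s >= minscn]
        self.cr_scns.insert(0, scn)
        del self.cr_scns[self.max_cr:]      # assumed policy: evict the oldest copies
        return list(self.cr_scns)


# First demonstration: three old CR copies, then two CR requests.
chain = CrChain(max_cr=6)
chain.cr_scns = [1227641, 1227638, 1227595]
print(chain.receive(1227713, minscn=1227700))
print(chain.receive(1227716, minscn=1227700))
```

With `enable_minscn=False` the same calls keep every older copy until the cap is reached, which is the qualitative behaviour seen in the second demonstration, where the list grew to 19 CR versions.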

Reposted from: http://osevo.baihongyu.com/
