While working on processing database large objects (LOBs) for my CDC solution for Oracle Database, I came across Jonathan Lewis's publication xmltype about how Oracle Database stores XMLTYPE data. These publications describe what happens in data blocks but do not touch upon what LGWR does or what is written to the redo log, which is very important in my work on an open-source CDC solution. Oracle Database supports three storage types for XMLTYPE:
Legacy storage of XMLTYPE in text form in a CLOB type column.
Binary XML in CSX format, which you can read more about in Inside Binary XML.
Object-relational storage, which is outside the scope of this article as it does not involve the use of LOBs.
So, what does Oracle Database write to the redo log for XMLTYPE?
Let's use a slightly modified example from Jonathan Lewis's xmltype and create a table that will include three XMLTYPE columns: one will store data in the legacy text CLOB format, the second as Binary XML, and the third as Binary XML with compression enabled.
REM
create table test_xml(
id number,
clob_col xmltype,
bin_col xmltype,
binc_col xmltype
)
xmltype clob_col store as clob
xmltype bin_col store as binary xml
xmltype binc_col store as binary xml(compress)
/
REM we need supplemental LOG DATA for CDC
alter table test_xml add supplemental log data (ALL) columns
/
And now we just insert one row into this table and write down the SCN of the beginning and end of our insertion into the table
REM Please write down this SCN
select current_scn from v$database
/
insert into test_xml
select
rownum,
sys.xmltype(other_xml, null, null, null),
sys.xmltype(other_xml, null, null, null),
sys.xmltype(other_xml, null, null, null)
from
v$sql_plan
where
length(other_xml) > 69000
and rownum < 2
/
REM Please write down this SCN
select current_scn from v$database
/
Now we are ready to get a redo log dump for our insert operation using either the standard Oracle database command
ALTER SYSTEM DUMP LOGFILE '/Users/averemee/polyxena/oracle/oradata/KAFKA19/KAFKA19/redo02.log' SCN MIN 48347954 SCN MAX 48348551;
or a utility included in my open-source CDC solution, which generates additional output about certain data changes.
java -cp oracdc-kafka-2.8.2-standalone.jar solutions.a2.cdc.oracle.utils.file.OraRedoLogFile -f /Users/averemee/polyxena/oracle/oradata/KAFKA19/KAFKA19/redo02.log -o test_xml.trc --start-scn 48347954 --end-scn 48348551 -r -b
In parallel, by running LogMiner, we will get the output from this, standard Oracle Database, utility
BEGIN
DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME => '/opt/oracle/oradata/KAFKA19/redo02' , OPTIONS => DBMS_LOGMNR.NEW);
DBMS_LOGMNR.START_LOGMNR(
STARTSCN => 48314485,
ENDSCN => 48451926,
OPTIONS =>
DBMS_LOGMNR.SKIP_CORRUPTION +
DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG +
DBMS_LOGMNR.NO_SQL_DELIMITER +
DBMS_LOGMNR.NO_ROWID_IN_STMT);
END;
/
And what we see is - XMLTYPE stored as CLOB is written as XML text using database changes of OP:26.6 (you can find the Java parsing code for this change in Layer 26 opcodes 2 and 6), and in the LogMiner output, these changes appear as 'XML DOC WRITE' operation records with code 70. And indeed, this results in a very large amount of redo, and it's clear why Oracle has long discouraged the use of this XMLTYPE storage type. It seems simple, but surprises still lie ahead.
After that we have 6 redo records (RBAs: 0x0002e3.00029526.0110/LEN: 0x20e4, 0x0002e3.00029537.0104/LEN: 0x20e4, 0x0002e3.00029548.00f8/LEN: 0x20e4, 0x0002e3.00029559.00ec/LEN: 0x20e4, 0x0002e3.0002956a.00e0/LEN: 0x20e4, 0x0002e3.0002957b.00d4/LEN: 0x065c
) for XMLTYPE stored in uncompressed binary format. All entries consist of a combination of changes OP:5.1 and OP:26.2 (you can find the Java parsing code for this change vector in Layer 26 opcodes 2 and 6). The total length of these redo records is 0x20e4 * 5 + 0x065c = 43728
less than the size of our original XML (69440 bytes), and as expected redo record at RBA 0x0002e3.00029526.0110
starts with Oracle Binary XML signature 9f 01
REDO RECORD - Thread:1 RBA: 0x0002e3.00029526.0110 LEN: 0x20e4 VLD: 0x01 CON_UID: 3517775930
SCN: 0x0000000002e1bd4d SUBSCN: 5
CHANGE #1 CON_ID:3 TYP:0 CLS:28 AFN:11 DBA:0x03000363 OBJ:4294967295 SCN:0x0000000002e1bd4b SEQ:2 OP:5.1 ENC:0 RBL:0 FLG:0x0000
ktudb redo: siz: 144 spc: 1870 flg: 0x0022 seq: 0x046f rec: 0x2d
xid: 0x0006.002.00000e03
ktubu redo: slt: 2 wrp: 3587 flg: 0x0000 prev dba: 0x00000000 rci: 44 opc: 26.1 [objn: 77378 objd: 77381 tsn: 5]
[Undo type ] Regular undo [User undo done ] No [Last buffer split] No
[Temp object] No [Tablespace Undo ] No [User only ] No
KDLI undo record:
KTB Redo
op: 0x02 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: C uba: 0x00000000.0000.00
KDLI common [12]
op 0x01 [UNDO]
type 0x20 [data]
flg0 0x05
flg1 0x00
psiz 8060
poff 128
dba 0x03000363
KDLI load data [4.56]
bdba [0x03000363]
kdlich
flg0 0x20 [ver=0 typ=data lock=n]
flg1 0x00
scn 0x0000000002e1bd4d [0x0000.0000.02e1bd4d]
lid 0000000100000011ed19
spare 0x00000000
kdlidh
flg2 0x00 [ver=0 lid=short-rowid hash=n cmap=n pfill=n]
flg3 0x00
pskip 0
sskip 0
hash 0000000000000000000000000000000000000000
hwm 0
spr 0
KDLI suplog [9.24]
xid 0x0006.002.00000e03
objn 77378
objv# 1
col# 3
flag 0x00000000
CHANGE #2 CON_ID:3 TYP:9 CLS:1 AFN:12 DBA:0x03000363 OBJ:77381 SCN:0x0000000002e1bd4c SEQ:2 OP:26.2 ENC:0 RBL:0 FLG:0x0000
KTB Redo
op: 0x02 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: C uba: 0x024010be.046f.2d
KDLI common [12]
op 0x00 [REDO]
type 0x20 [data]
flg0 0x03
flg1 0x00
psiz 8060
poff 128
dba 0x03000363
KDLI load data [4.56]
bdba [0x03000363]
kdlich
flg0 0x28 [ver=0 typ=data lock=y]
flg1 0x00
scn 0x0000000002e1bd4d [0x0000.0000.02e1bd4d]
lid 0000000100000011ed19
spare 0x00000000
kdlidh
flg2 0x00 [ver=0 lid=short-rowid hash=n cmap=n pfill=n]
flg3 0x00
pskip 0
sskip 0
hash 0000000000000000000000000000000000000000
hwm 8060
spr 0
KDLI fill [6.8068]
foff 0x0000
fsiz 8060
flen 8060
data
9f 01 63 9e 00 00 00 c8 3e a4 c8 2b ba c0 09 0e af 64 62 5f 76 65 72 73 69 6f
6e 07 31 39 2e 30 2e 30 2e 30 d9 d7 d5 c0 0b 0e af 70 61 72 73 65 5f 73 63 68
...........
...........
But these redo records, the redo records with our XML, are missing from the LogMiner output along with other redo records containing OP:11.17 (you can find the Java parsing code for this change vector in 11.17 LogMiner support for LOB operations) change vectors!
After that we have 2 redo records (RBAs: 0x0002e3.000297cc.0148/LEN: 0x20e4, 0x0002e3.000297dd.013c/LEN: 0x0524
) for XMLTYPE stored in compressed binary format. All entries consist of a combination of changes OP:5.1 and OP:26.2 (you can find the Java parsing code for this change vector in Layer 26 opcodes 2 and 6). The total length of these redo records is 0x20e4 + 0x0524 = 9736,
which is 6 times smaller than the size of our original XML (69440 bytes), and as expected redo record at RBA 0x0002e3.000297cc.0148
starts with zlib signature 78 01
REDO RECORD - Thread:1 RBA: 0x0002e3.000297cc.0148 LEN: 0x20e4 VLD: 0x01 CON_UID: 3517775930
SCN: 0x0000000002e1bd66 SUBSCN: 4
CHANGE #1 CON_ID:3 TYP:0 CLS:28 AFN:11 DBA:0x0300037b OBJ:4294967295 SCN:0x0000000002e1bd65 SEQ:1 OP:5.1 ENC:0 RBL:0 FLG:0x0000
ktudb redo: siz: 144 spc: 864 flg: 0x0022 seq: 0x046f rec: 0x34
xid: 0x0006.002.00000e03
ktubu redo: slt: 2 wrp: 3587 flg: 0x0000 prev dba: 0x00000000 rci: 51 opc: 26.1 [objn: 77378 objd: 77383 tsn: 5]
[Undo type ] Regular undo [User undo done ] No [Last buffer split] No
[Temp object] No [Tablespace Undo ] No [User only ] No
KDLI undo record:
KTB Redo
op: 0x02 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: C uba: 0x00000000.0000.00
KDLI common [12]
op 0x01 [UNDO]
type 0x20 [data]
flg0 0x05
flg1 0x00
psiz 8060
poff 128
dba 0x0300037b
KDLI load data [4.56]
bdba [0x0300037b]
kdlich
flg0 0x20 [ver=0 typ=data lock=n]
flg1 0x00
scn 0x0000000002e1bd66 [0x0000.0000.02e1bd66]
lid 0000000100000011ed1a
spare 0x00000000
kdlidh
flg2 0x00 [ver=0 lid=short-rowid hash=n cmap=n pfill=n]
flg3 0x00
pskip 0
sskip 0
hash 0000000000000000000000000000000000000000
hwm 0
spr 0
KDLI suplog [9.24]
xid 0x0006.002.00000e03
objn 77378
objv# 1
col# 4
flag 0x00000000
CHANGE #2 CON_ID:3 TYP:9 CLS:1 AFN:12 DBA:0x0300037b OBJ:77383 SCN:0x0000000002e1bd66 SEQ:2 OP:26.2 ENC:0 RBL:0 FLG:0x0000
KTB Redo
op: 0x02 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: C uba: 0x024010be.046f.34
KDLI common [12]
op 0x00 [REDO]
type 0x20 [data]
flg0 0x03
flg1 0x00
psiz 8060
poff 128
dba 0x0300037b
KDLI load data [4.56]
bdba [0x0300037b]
kdlich
flg0 0x28 [ver=0 typ=data lock=y]
flg1 0x00
scn 0x0000000002e1bd66 [0x0000.0000.02e1bd66]
lid 0000000100000011ed1a
spare 0x00000000
kdlidh
flg2 0x00 [ver=0 lid=short-rowid hash=n cmap=n pfill=n]
flg3 0x00
pskip 0
sskip 0
hash 0000000000000000000000000000000000000000
hwm 8060
spr 0
KDLI fill [6.8068]
foff 0x0000
fsiz 8060
flen 8060
data
78 01 cd 5d cb af a5 c7 51 9f b9 a3 90 27 96 12 20 90 f7 e4 f8 c6 76 b0 62 9d
ef 7d 0e 8e 82 cf e3 3b 33 f7 fa 3e 26 f7 91 1b 63 c4 91 9d 38 b1 35 76 e2 d8
...........
...........
But these redo records, as is the case with storing our XML in uncompressed binary format, are missing from the LogMiner output along with other redo records containing OP:11.17 (you can find the Java parsing code for this change vector in 11.17 LogMiner support for LOB operations) change vectors!
Luckily, our short transaction with just one INSERT statement also contains additional redo records with type OP:, and these are large in size, and their RBA is present in the LogMiner output!
We have two triplets of consecutive change vectors OP:24.8 (you can find the Java parsing code for this change vector in KRVXML - xmlredo - doc or dif - opcode), one triplet for the column with INTERNAL_ID 4 (RBAs: 0x0002e3.000297e0.0108/LEN:0x8054, 0x0002e3.00029822.017c/LEN:0x8054, 0x0002e3.00029865.0010/LEN:LEN: 0x0e54
) and in the LogMiner output these changes correspond to the changes of our column BIN_COL stored as binary uncompressed XML, the second екшздуе for the column with INTERNAL_ID 6 (RBAs: 0x0002e3.0002986c.00d4/LEN:0x8054, 0x0002e3.000298ae.0148/LEN:0x8054, 0x0002e3.000298f0.01bc/LEN:LEN: 0x0e54
) and in the LogMiner output these changes correspond to the changes of our column BINC_COL. The data in each triple is the same size 0x8054 * 2 + 0x0e54 = 69372
, which is not surprising – after all, we were inserting the same XML text into them. It is slightly larger than the size of the original XML (69440 bytes) because each redo record has its own header, the change vector itself also has a header, and it stores additional technical information within itself. And simply decoding the data in OP:24.8 (you can find the Java parsing code for this change vector in KRVXML - xmlredo - doc or dif - opcode)
REDO RECORD - Thread:1 RBA: 0x0002e3.000297e0.0108 LEN: 0x8054 VLD: 0x01 CON_UID: 3517775930
SCN: 0x0000000002e1bd66 SUBSCN: 7
CHANGE #1 MEDIA RECOVERY MARKER CON_ID:3 SCN:0x0000000000000000 SEQ:0 OP:24.8 ENC:0 FLG:0x0000
XML redo opcode type: 1 xid:0x0006.002.00000e03 obj:77378
internal column id: 4
XML data load
3c 6f 74 68 65 72 5f 78 6d 6c 3e 3c 69 6e 66 6f 20 74 79 70 65 3d 22 64 62 5f
...........
...........
provides the following result
<other_xml><info type>"db_
So, what's the conclusion? Despite significant improvements in storing XMLTYPE in binary format, especially when using compression, all this is undone by the fact that to support Change Data Capture, the database writes the original XML to the redo log. The redo log is a bottleneck in the database I/O system. And therefore, it's worth reconsidering – is it really right to store files in the database? After all, there are file systems for this.
P.S. This isn't the only curiosity from my little research – my short transaction doesn't contain the combination of changes 5.1 + 11.2 typical for INSERT statement. Things aren't simple in Oracle :-)
Good show!