Oracle 26ai Vector Indexes

In Oracle 26ai There are two types of indexes when dealing with vector data:

  • IVF Inverted File Flat it’s a disk based index
  • Hierarchical Navigable Small World (HNSW) index  an in-memory based index

Let us first explore IVF index:

I will create a dummy table called galaxies and insert some data:

sqlplus vector_user/vectorai@FREEPDB1

create table galaxies (id number, name varchar2(50), doc varchar2(500), embedding vector);

insert into galaxies (id, name,doc, embedding)

select 1,’M31′,’Messier 31 is a barred spiral galaxy in the Andromeda constellation which has a lot of barred spiral galaxies’,TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Messier 31 is a barred spiral galaxy in the Andromeda constellation which has a lot of barred spiral galaxies’ as data));

commit;

insert into galaxies (id, name,doc, embedding)

select 2,’M33′,’Messier 33 is a spiral galaxy in the Triangulum constellation’,TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Messier 33 is a spiral galaxy in the Triangulum constellation’ as data));

commit;

insert into galaxies (id, name,doc, embedding)

select 3,’M58′,’Messier 58 is an intermediate barred spiral galaxy in the Virgo constellation’,TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Messier 58 is an intermediate barred spiral galaxy in the Virgo constellation’ as data));

commit;

insert into galaxies (id, name,doc, embedding)

select 4,’M63′,’Messier 63 is a spiral galaxy in the Canes Venatici constellation’,TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Messier 63 is a spiral galaxy in the Canes Venatici constellation’ as data));

commit;

insert into galaxies (id, name,doc, embedding)

select 5,’M77′,’Messier 77 is a barred spiral galaxy in the Cetus constellation’,TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Messier 77 is a barred spiral galaxy in the Cetus constellation’ as data));

commit;

insert into galaxies (id, name,doc, embedding)

select 6,’M91′,’Messier 91 is a barred spiral galaxy in the Coma Berenices constellation’,TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Messier 91 is a barred spiral galaxy in the Coma Berenices constellation’ as data));

commit;

insert into galaxies (id, name,doc, embedding)

select 7,’M49′,’Messier 49 is a giant elliptical galaxy in the Virgo constellation’,TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Messier 49 is a giant elliptical galaxy in the Virgo constellation’ as data));

commit;

insert into galaxies (id, name,doc, embedding)

select 8,’M60′,’Messier 60 is an elliptical galaxy in the Virgo constellation’,TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Messier 60 is an elliptical galaxy in the Virgo constellation’ as data));

commit;

insert into galaxies (id, name,doc, embedding)

select 9,’NGC1073′,’NGC 1073 is a barred spiral galaxy in Cetus constellation’,TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘NGC 1073 is a barred spiral galaxy in Cetus constellation’ as data));

commit;

Then, let us run explain plan against a SELECT Query as shown below:

explain plan for select id, name,DOC

from galaxies

order by vector_distance(embedding, TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Vigro’ as data)), COSINE)

fetch approx first 2 rows only;

select * from table(dbms_xplan.display);

A Full table scan has occurred as shown below:

// don’t execute the following command in production environment as it clears out data from memory !!

SQL> alter system FLUSH BUFFER_CACHE;

// run the following query to check performance without index:

SQL> set timing on

SQL> select id, name,DOC

from galaxies

order by vector_distance(embedding, TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Vigro’ as data)), COSINE)

fetch approx first 2 rows only;

The execution time was 240 millisecond

Will create an index:

SQL> CREATE VECTOR INDEX galaxies_ivf_idx ON galaxies (embedding) ORGANIZATION NEIGHBOR PARTITIONS

DISTANCE COSINE

WITH TARGET ACCURACY 95;

// don’t execute the following command in production environment as it clears out data from memory !!

SQL> alter system FLUSH BUFFER_CACHE;

SQL> alter system flush shared_pool;

// checking the run time of the SELECT statement after adding an index:

SQL> set timing on

SQL> select id, name,DOC

from galaxies

order by vector_distance(embedding, TO_VECTOR(VECTOR_EMBEDDING(doc_model USING ‘Vigro’ as data)), COSINE)

fetch approx first 2 rows only;

The execution time was 37 milliseconds

So, indexes are very important when dealing with vector data in your Oracle database system.

Overcoming Blocking & Locking in Oracle 19c with max_idle_blocker_time parameter

To illustrate and simulate locking and blocking let me first prepare a dummy table and two database accounts for simulation:

create table SH1.purchase_orders ( product varchar2(20), sales number (5));

SQL> insert into SH1.purchase_orders values (‘milk’, 20);

SQL> insert into SH1.purchase_orders values (‘corn_flakes’, 40);

SQL> insert into SH1.purchase_orders values (‘cheese’, 16);

SQL> create user jerry identified by jerry;

SQL> grant create session to jerry;

SQL> grant select,update on SH1.purchase_orders to jerry;

SQL> create user donald identified by donald;

SQL> grant create session to donald;

SQL> grant select,update on SH1.purchase_orders to donald;

// as shown below both max_idle_time, max_idle_blocker_time are set to “0” which means unlimited

SQL> show parameter idle

NAME                                 TYPE        VALUE

———————————— ———– ——————————

max_idle_blocker_time                integer     0

max_idle_time                        integer     0

In the first session as user jerry and auto-commit is to “FALSE” as shown below so you will need to explicitly end the transaction either by “commit” or “rollback”:

In the second session as user donald [ donald session is hanging because there is already shared row lock on the table]

To find blocking session:

SQL> select * from v$lock where block >0;

SQL> select SID,USERNAME,STATUS,PROGRAM from v$session where SID=290;

To  find locking:

SQL> select * from V$SESSION_EVENT where WAIT_CLASS=’Application’ and EVENT like ‘enq: TX – row lock contention’;

SID 283 is for account “donald”

So, based on the above scenario lock/block will remain unless jerry’s session commit or rollback.

Let us now explore max_idle_blocker_time parameter and set to 1 minute at the pluggable database level:

sqlplus / as sysdba

SQL> alter session set container=ORCLPDB1;

Session altered.

SQL> alter system set max_idle_blocker_time=1;

System altered.

SQL> show parameter idle

NAME                                 TYPE        VALUE

———————————— ———– ——————————

max_idle_blocker_time                integer     1

max_idle_time                        integer     0

Repeat the scenario again….

After 1 minute in jerry’s session:

The session is terminated as expected after 1 minute

And, blocking disappears however locking from “donald” session will remain so it will require either “commit” or “rollback” for the transaction being executed.

dbms_dictionary_check package back-ported to 19c [19.22]

DBMS_DICTIONARY_CHECK package has been back-ported from 23c to oracle 19c when applying January 2024 RU [19.22]. The package is a read-only PL/SQL package procedure that helps you identify Oracle Database dictionary. its in some sense the evolution of hcheck.sql script that you need to download and install, this is not needed anymore … this package is now available as built-in is really a positive move.

To Run the Package to examine your database state:

SQL> set serveroutput on size unlimited
SQL> execute dbms_dictionary_check.full;

A detailed report will be generated as shown below:

To Check “Critical” Findings and the status:

SQL> execute dbms_dictionary_check.critical

The following are the 3 main areas that package will consider and flag “Critical”and should be in “PASS” or valid state otherwise it has negative impact and will require your attention/fix:

To Run a FULL Repair for the reported findings:

SQL> set serveroutput on size unlimited
SQL> EXECUTE dbms_dictionary_check.full(repair=>TRUE);

For further reference documentation: https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms-dictionary-check.html#GUID-BD44F6AB-437B-4351-9CAE-9190675FAD21