
Oracle University Podcast

Oracle Corporation

Available episodes

5 of 123
  • Oracle GoldenGate 23ai: Parameters, Data Selection, Filtering, & Transformation
    In the final episode of this series on Oracle GoldenGate 23ai, Lois Houston and Nikita Abraham welcome back Nick Wagner, Senior Director of Product Management for GoldenGate, to discuss how parameters shape data replication. This episode covers parameter files, data selection, filtering, and transformation, providing essential insights for managing GoldenGate deployments.   Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. --------------------------------------------------------------- Podcast Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started! 00:25 Lois: Hello and welcome to the Oracle University Podcast! I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Team Lead: Editorial Services.  Nikita: Hi everyone! This is the last episode in our Oracle GoldenGate 23ai series. Previously, we looked at how you can manage Extract Trails and Files. If you missed that episode, do go back and give it a listen.  00:50 Lois: Today, Nick Wagner, Senior Director of Product Management for GoldenGate, is back on the podcast to tell us about parameters, data selection, filtering, and transformation. These are key components of GoldenGate because they allow us to control what data is replicated, how it's transformed, and where it's sent. Hi Nick! Thanks for joining us again. So, what are the different types of parameter files? 
Nick: We have a GLOBALS parameter file and your runtime parameter files. The GLOBALS file is going to affect all processes within a deployment. It's going to be things like the location and name of your checkpoint table, and things like the heartbeat table. You want to have a single one of these across your entire deployment, so it makes sense to keep it within a single file. We also have runtime parameter files. These are going to be associated with a specific Extract or Replicat process. These files are located in your OGG_ETC_HOME/conf/ogg. The GLOBALS file is simply named GLOBALS, in all capitals, and the parameter files for the processes themselves are named with the process name followed by .prm. So if my Extract process is EXTDEMO, my parameter file name will be extdemo.prm. When you make changes to parameter files, they don't take effect until the process is restarted. So in the case of a GLOBALS parameter file, you need to restart the administration service. And in the case of a runtime parameter file, you need to restart that specific process before any changes will take effect. We also have what we call a managed process settings profile. This allows you to set up auto-restart profiles for each process. In the GoldenGate classic architecture, this was contained within the GLOBALS parameter file and handled by the manager. Microservices is a little bit different: it's handled by the service manager itself, and we now actually set up profiles. 02:41 Nikita: Ok, so what can you tell us about the extract parameter file specifically?  Nick: There are a couple of things within the extract parameter file that are in common use. First, we want to tell it what the group name is. So in this case, it would be our extract name. We need to put in information on where the extract process is going to be writing the data it captures, and that would be our trail files. An extract process can write to one or more trail files. 
We also want to list out the tables and schemas that we're going to be capturing, as well as any kind of DDL changes. If we're doing an initial load, we want to set up the SQL predicate to determine which tables are being captured and put a WHERE clause on those to speed up performance. We can also do filtering within the extract process, so we write just the information that we need to the trail file. 03:27 Nikita: And what are the common parameters within an extract process? Nick: There are a couple of common parameters within your extract process. We have TABLE to list out the tables that GoldenGate is going to be capturing from. These can be wildcarded. So I can simply do TABLE *.* and GoldenGate will capture all the tables in that database. I can also do schema.* and it will capture all the tables within a schema. We have our EXTTRAIL parameter, which tells GoldenGate which trail to write to. If I want to filter out certain rows and columns, I can use the FILTER, COLS, and COLSEXCEPT parameters. GoldenGate can also capture sequence changes, so we would use the SEQUENCE parameter. And then we can also set some high-level database options for GoldenGate that affect all the tables, and that's configured using the TRANLOGOPTIONS parameter.  04:14 Lois: Nick, can you talk a bit about the different types of TRANLOGOPTIONS settings? How can they be used to control what the extract process does? Nick: So one of the first ones is EXCLUDETAG. GoldenGate has the ability to exclude tagged transactions. Within the database itself, you can actually specify a transaction to be tagged using a DBMS set tag option. GoldenGate Replicat also sets its transactions with a tag so that the GoldenGate process knows which transactions were done by the Replicat and it can exclude them automatically. You can do EXCLUDETAG with a plus. That simply means to exclude any transaction that's been tagged with any value. You can also exclude specific tags.  
Another good option for TRANLOGOPTIONS is ENABLE_PROCEDURAL_REPLICATION. This allows GoldenGate to actually capture and replicate database procedure calls, and this would be things like DBMS_AQ enqueue or dequeue operations. So if you're using Oracle Advanced Queuing and you need GoldenGate to replicate those changes, it can.  Another valuable TRANLOGOPTIONS setting is ENABLE_AUTO_CAPTURE. Within the Oracle Database, you can actually issue an ALTER TABLE command that says ALTER TABLE, ENABLE LOGICAL REPLICATION. Or when you create a table, you can use the ENABLE LOGICAL REPLICATION option at the end of the CREATE TABLE statement. And this tells GoldenGate to automatically capture that table. One of the nice features about this is that I don't need to specify that table in my parameter file, and it'll automatically enable supplemental logging on that table for me using scheduling columns. So it makes it very easy to set up replication between Oracle databases.  06:01 Nikita: Can you tell us about replicat parameters, Nick? Nick: Within a replicat, we'll have the group name. Some other common parameters that we'll use include a mapping parameter that allows us to map the source to target table relationships. We can do transformation within the replicat, as well as error handling and controlling group operations to improve performance. Some common replicat parameters include the REPLICAT parameter itself, which tells us what the name of that replicat is. We have our MAP statement, which allows us to map a source object to a target object. We have things like REPERROR that control how to handle errors. INSERTALLRECORDS allows us to convert update and delete operations into inserts. We can do things like COMPARECOLS, which helps with active-active replication in determining which columns are used in the GoldenGate WHERE clause. 
We also have the ability to use macros and column mapping to do additional transformation and make the parameter file look elegant. 07:07 AI is being used in nearly every industry…healthcare, manufacturing, retail, customer service, transportation, agriculture, you name it! And it’s only going to get more prevalent and transformational in the future. It’s no wonder that AI skills are the most sought-after by employers. If you’re ready to dive in to AI, check out the OCI AI Foundations training and certification that’s available for free! It’s the perfect starting point to build your AI knowledge. So, get going! Head on over to mylearn.oracle.com to find out more. 07:47 Nikita: Welcome back! Let’s move on to some of the most interesting topics within GoldenGate… data mapping, selection, and transformation. As I understand, users can do pretty cool things with GoldenGate. So Nick, let’s start with how GoldenGate can manipulate, change, and map data between two different databases. Nick: The MAP statement within a Replicat parameter file allows you to provide specifications on how you're going to map source and target objects. You can also use MAP in an Extract, but it's pretty rare. That would be used if you needed to write the object name inside the trail files as a different name than the actual object name that you're capturing from. GoldenGate can also do different data selection, mapping, and manipulation, and this is all controlled within the Extract and Replicat parameter files. In the classic architecture of GoldenGate, you could do a rudimentary level of transformation and filtering within the extract pump. Now, the distribution service only allows you to do filtering. Any transformation that you had within the pump would need to be moved to the Extract or the Replicat process.  The other thing that you can do within GoldenGate is select and filter data based on different levels and conditions. 
So within your parameter clause, you have your TABLE and MAP statements. That's the core of everything. You have your filtering. You have COLS and COLSEXCEPT, which allow you to determine which columns you're going to include or exclude from replication. The TABLE and MAP statements work at the table level. FILTER works at the row level. And COLS and COLSEXCEPT work at the column level. We also have the ability to filter by operation type too. So GoldenGate has some very easy parameters called GETINSERTS, GETUPDATES, GETDELETES, and conversely IGNOREUPDATES, IGNOREDELETES, IGNOREINSERTS. And that will affect the operation type. 09:40 Lois: Nick, are there any features that GoldenGate provides to make data replication easier? Nick: The first thing is that GoldenGate is going to automatically match your source and target column names with a parameter called USEDEFAULTS. You can specify it inside of your COLMAP clause, but again, it's a default, so you don't need to worry about it. We also handle all data type and character set conversion. Because we store the metadata in the trail, we know what that source data type is like. When we go to apply the record to the target table, the Replicat process is going to look up the definition of that record and keep a repository of that in memory. So when it knows that, hey, this value coming in from the trail file is going to be of a date data type, and this value in the target database is going to be a character data type, it knows how to convert that date to a character, and it'll do it for you. Most of the conversion is going to be done automatically for data types. Where we don't do automatic data type conversion is if you're using abstract data types or user-defined data types, collections, arrays, and some types of CLOB operations. For example, if you're going from a BLOB to a JSON, that's not really going to work very well. Character set conversion is also done automatically. 
It's not necessarily done directly by GoldenGate, but it's done by the database engine. So there is a character set value inside that source database.  And when GoldenGate goes to apply those changes into the target system, it's ensuring that that character set is visible and named so that that database can do the necessary translation. You can also do advanced filtering transformation. There's tokens that you can attach from the source environment, database, or records into a record itself on the trail file. And then there's also a bunch of metadata that GoldenGate can use to attach to the record itself. And then of course, you can use data transformation within your COLMAP statement. 11:28 Nikita: Before we wrap up, what types of data transformations can we perform, Nick?  Nick: So there's quite a few different data transformations. We can do constructive or destructive transformation, aesthetic, and structural. 11:39 Lois: That’s it for the Oracle GoldenGate 23ai: Fundamentals series. I think we covered a lot of ground this season. Thank you, Nick, for taking us through it all.  Nikita: Yeah, thank you so much, Nick. And if you want to learn more, head over to mylearn.oracle.com and search for the Oracle GoldenGate 23ai: Fundamentals course. Until next time, this is Nikita Abraham… Lois: And Lois Houston, signing off! 12:04 That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
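    Editor's illustration: Nick describes selection happening at four levels: table (TABLE and MAP), operation type (the GET and IGNORE parameters), row (FILTER), and column (COLS and COLSEXCEPT). The Python sketch below is only a toy model of that layering, not GoldenGate syntax or any GoldenGate API. The names `ChangeRecord` and `select` are hypothetical, and the order in which the checks run is an assumption made for readability.

```python
from dataclasses import dataclass


@dataclass
class ChangeRecord:
    table: str      # e.g. "HR.EMPLOYEES"
    op: str         # "INSERT", "UPDATE", or "DELETE"
    columns: dict   # column name -> value


def select(record, tables, row_filter=None, cols_except=(), ignore_ops=()):
    """Toy model: return the record as it would be written, or None if dropped."""
    # Table level (what TABLE / MAP decide): is this table captured at all?
    if record.table not in tables:
        return None
    # Operation level (what IGNOREINSERTS / IGNOREUPDATES / IGNOREDELETES decide).
    if record.op in ignore_ops:
        return None
    # Row level (what FILTER decides): keep only rows matching a condition.
    if row_filter is not None and not row_filter(record.columns):
        return None
    # Column level (what COLSEXCEPT decides): drop the excluded columns.
    kept = {k: v for k, v in record.columns.items() if k not in cols_except}
    return ChangeRecord(record.table, record.op, kept)


update = ChangeRecord("HR.EMPLOYEES", "UPDATE",
                      {"ID": 1, "SALARY": 5000, "SSN": "000-00-0000"})
written = select(update,
                 tables={"HR.EMPLOYEES"},
                 row_filter=lambda cols: cols["SALARY"] > 1000,
                 cols_except={"SSN"},
                 ignore_ops={"DELETE"})
print(written)
```

    In this model the update survives the table, operation, and row checks, and is written without the excluded SSN column. A delete on the same table would be dropped at the operation level.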
    --------  
    12:34
  • Oracle GoldenGate 23ai: Managing Extract Trails and Files
    In this episode of the Oracle University Podcast, Lois Houston and Nikita Abraham explore the intricacies of trail files in Oracle GoldenGate 23ai with Nick Wagner, Senior Director of Product Management.   They delve into how trail files store committed operations, preserving the order of transactions and capturing essential metadata. Nick explains that trail files are self-describing, containing database and table definition records, making them easier to work with. The episode also covers trail file management, including the purge trail task and the ability to download trail files directly from the web UI, providing flexibility in various deployment scenarios.   Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. ------------------------------------------------------------- Podcast Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started! 00:25 Nikita: Welcome back to another episode of the Oracle University Podcast! I’m Nikita Abraham, Team Lead of Editorial Services with Oracle University, and I’m joined by Lois Houston, Director of Innovation Programs. Lois: Hi there! In our last episode, we discussed the Replicat process. That was a good introduction, and you should give it a listen if you’re interested in the fundamentals of GoldenGate 23ai.  
00:49 Nikita: Nick Wagner, Senior Director of Product Management for Oracle GoldenGate, is back with us today to talk about how to manage Extract Trails and Files. Hi Nick, it’s a pleasure to have you with us. So, we’ve spoken about trail files in our earlier episodes. But can you tell us about the kind of information that’s actually stored in these files?  Nick: The trail files contain committed operations only. In an Oracle environment, the extract process is actually able to understand and read both committed and uncommitted transactions. It holds the uncommitted activity in the cache manager, according to its associated settings. As a transaction is committed, it then flushes that information to the trail file. All this information in the transaction is preserved, so we have not only the transaction itself, but the order of the operations within that transaction. All the changed columns, including the primary key and any scheduling columns, are also captured, and this is controlled by the LOGALLSUPCOLS parameter and other parameters within the extract process. The data captured depends on settings in the extract parameter file, and you can include additional information, including tokens. The trail files also contain metadata information. The trail files are what we call self-describing, which means that as we start reading in new objects, we start writing the definition of those objects into the trail file itself. 02:11 Lois: Nick, what does the structure of a trail file look like? Nick: The trail files have header information, which simply keeps information about what version of trail file this is, where GoldenGate is handling it, and information about that trail file itself. You'll also have three different types of records. You'll have a data record, which contains the actual before and after images, the table update statement, the type of operations. 
You have a database definition record, which includes information about the database that GoldenGate is capturing from, and then you'll also have a table definition record. As GoldenGate starts up and creates a trail file for the first time, it's always going to write the trail file header and associated database definition record, and then it's going to start reading data out of the source database. As it encounters a new table for the first time in that trail file, it's going to write the metadata for that object as well. This makes it very easy. This means that within a single trail file, any data records I have in there, that trail file also contains the associated table definition record for that table. 03:20 Nikita: Let’s talk about compatibility between different versions of GoldenGate. How do the trail files fit into that? Nick: The GoldenGate trail files themselves have information built into them to help understand what they're compatible with as far as GoldenGate releases. If I'm replicating from a new version of GoldenGate to an older version of GoldenGate, I can set the format release value to tell the extract process to write these trail files in an older version. In this case, I can simply say format release 19 and it'll write the trail files in the 19C version. If you're going from an older version to a newer version of GoldenGate, it's automatically able to process the old version trail file structure without having to change anything.  04:02 Nikita: Now, GoldenGate is constantly generating these trail files as it runs. So, how do we manage them over time. What’s the cleanup process like? Nick: Within the GoldenGate microservices architecture, the web UI has a way to manage your trail files and clean them up. So there's a purge trail task that allows you to go in and set up rules on how long to keep the trail files around for before they're purged. 
We have customers that want to reposition extract and so you'll want to make sure that you keep trail files around long enough so that you can handle any reposition that you intend to do. Trail files will always be kept around even past their purge rules if they're still needed for GoldenGate recovery. Also new to GoldenGate 23ai is the ability to download trail files directly from the web UI. This is extremely helpful if you're using OCI GoldenGate or you don't have OS access on the machine where GoldenGate is running. 04:56 Lois: What if we want to look inside these trail files and actually see the individual records? How can we examine them directly? Nick: Well, that can be done using a tool called Logdump. Logdump is a utility that's installed in your OGG_HOME/bin directory. It has online help as well as full documentation. 05:14 Lois: And how do you use Logdump? Nick: So to use Logdump, the first thing you'll do is launch the service and then you'll open a trail file. You would specify the full path of the trail file along with the path name and the sequence number of that trail file. Once you've set it up, you'll position into that location within that trail file. Normally people position at record 0 and then they'll do a NEXT, which allows them to get the next record. There are a couple of other commands in there, such as POS, which allows you to set the position, and SCANFORHEADER, which allows you to scan to the next record if you're positioned in the middle of a record. When you first run Logdump, it's not going to have very much information available for you. So, you'll want to turn on a couple of settings. You'll want to enable FILEHEADER, GHDR, and DETAIL to be able to see more information about what's going on within that record within the trail. Logdump also has the ability to show you the actual ASCII values as opposed to the text value. This is very useful for dealing with multibyte data as well as unprintable characters. 
You can also specify the length of the record to show for each Logdump record. This is the RECLEN parameter; 280 is a rough number, and it will usually show about enough to fit on a single page. 06:40 Join the Oracle University Learning Community and tap into a vibrant network of over 1 million members, including Oracle experts and fellow learners. This dynamic community is the perfect place to grow your skills, connect with likeminded learners, and celebrate your successes. As a MyLearn subscriber, you have access to engage with your fellow learners and participate in activities in the community. Visit community.oracle.com/ou to check things out today! 07:12 Nikita: Welcome back! Nick, earlier you mentioned data records in trail files. What kind of information do these records contain? Nick: When we start looking at data records within the trail file, we're going to see a slightly different format. It's going to give us information about what type of operation this was, and the before/after indicator: is this an after image or a before image? It's going to give us the time information. It's going to tell us what table this record was on and the values within that record. We can also count the number of records in a trail using the COUNT option, which tells us how many records are in the trail, the average size, and the operation type breakdown. We can also get some additional details on that count, including having it broken out by table and operation within those tables. This is really useful if you're trying to track down a missing record or an out-of-sync condition and you want to make sure that GoldenGate is appropriately capturing all the changes. We can also use an option within Logdump called SCANFORMETADATA. The shorthand for this command is SFMD; it allows you to scan for something like a database definition record.  You may have multiple database definition record versions within the same trail file. 
It tells us what type of database this was and the character set, which is important because this information is used by the Replicat when it goes to apply changes into the target database. We can also scan for metadata to get table definition records. The data types are numeric values that are associated with an internal GoldenGate data type. 08:43 Lois: Thank you, Nick, for your insights. There’s a lot more you can find in the Oracle GoldenGate 23ai: Fundamentals course on MyLearn. So, make sure you check that out by visiting mylearn.oracle.com. Nikita: Join us next week for a discussion on parameters, data selection, filtering, and transformation. Until then, this is Nikita Abraham… Lois: And Lois Houston signing off! 09:07 That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
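    Editor's illustration: the "self-describing" idea from this episode is that a trail file starts with a header and a database definition record, and that a table definition record is written the first time a table shows up in that file, ahead of its data records. The Python sketch below models only that ordering, plus a count broken out by table and operation. It does not reflect the real trail file format or the Logdump implementation, and `TrailWriter`, `write_change`, and `count_by_table_and_op` are hypothetical names.

```python
from collections import Counter


class TrailWriter:
    """Toy model of a self-describing trail file (not the real binary format)."""

    def __init__(self, database_name):
        # A new trail file always begins with a header and a database definition.
        self.records = [("FILE_HEADER", None), ("DATABASE_DEF", database_name)]
        self._tables_seen = set()

    def write_change(self, table, op, definition):
        # First time this table appears in this file: write its definition first.
        if table not in self._tables_seen:
            self.records.append(("TABLE_DEF", (table, definition)))
            self._tables_seen.add(table)
        self.records.append(("DATA", (table, op)))


def count_by_table_and_op(records):
    """Count data records per (table, operation) pair."""
    return Counter(payload for kind, payload in records if kind == "DATA")


trail = TrailWriter("ORCL")
trail.write_change("HR.EMP", "INSERT", ["ID", "NAME"])
trail.write_change("HR.EMP", "UPDATE", ["ID", "NAME"])
trail.write_change("HR.DEPT", "INSERT", ["ID"])

for kind, payload in trail.records:
    print(kind, payload)
print(count_by_table_and_op(trail.records))
```

    Because every data record in the file is preceded somewhere by the definition of its table, a reader can interpret the file without going back to the source database, which is the property Nick highlights.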
    --------  
    9:36
  • Oracle GoldenGate 23ai: The Replicat Process
    In this episode, Lois Houston and Nikita Abraham, along with Nick Wagner, Senior Director of Product Management, dive into the Replicat process in Oracle GoldenGate 23ai.   They discuss how Replicat applies changes to the target database, highlighting the different types: Classic, Coordinated, and Parallel Replicat.   Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. ---------------------------------------------------------------- Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started! 00:25 Lois: Hello and welcome to another episode of the Oracle University Podcast. I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Team Lead: Editorial Services.  Nikita: Hi everyone! If you’ve been listening to us these last few weeks, you’ll know we’ve been discussing the fundamentals of GoldenGate 23ai. Today is going to be all about the Replicat process. Again, this is something we’ve discussed briefly in earlier episodes, but just to recap, the Replicat process applies changes from the source database to the target database. It's responsible for reading trail files and applying the changes to the target system. 01:04 Lois: That’s right, Niki. And we’ll be chatting with Nick Wagner, Senior Director of Product Management for Oracle GoldenGate. Hi Nick! Thanks for joining us again today. Let’s get straight into it. 
Can you give us an overview of the Replicat process? Nick: One thing that's very important is the Replicat is extremely chatty with that target database. So it's going to be going in and trying to make lots of little transactions on that system. The Replicat process only issues single row DML. So if you can imagine a source database that's generating hundreds of thousands of changes per second, we're going to have to have a Replicat process that can do 100,000 changes per second on that target site. That means that it's going to have to send a lot of little one record commands. And so we've got a lot of ways to optimize that. But in all situations you're really going to want very, very low ping time between that Replicat process and that target database. This often means that if you're going to be running GoldenGate in a cloud, you're going to want the Cloud GoldenGate environment to be running in that target data center, wherever that target database is. 02:06 Lois: What are the key characteristics of the process, Nick? Nick: Replicat process is going to read the changes from the trail file and then apply them to the target system, just like any database user would. It's not doing anything special where it's going under the covers and trying to apply directly to the database blocks. It's just applying regular standard insert, update, delete, and DDL statements to that target database. A single trail file does support high volume of data replication activity depending on the type of Replicat. Replicats do preserve the boundary of their transactions. So in the situations, by default, a transaction that's on the source, let's say five inserts followed by a commit will remain five inserts followed by a commit on the target site. There are some operations and changes that do affect this, but they're not turned on by default. There are things like group transactions that allows you to group multiple transactions into a single commit. 
This one could actually improve performance in some cases. We also have batch SQL that can change the boundaries of a transaction as well. And then in a Parallel Replicat, you actually have the ability to split a large transaction into multiple chunks and apply those chunks in Parallel. So again, by default, it's going to preserve the boundaries, but there are ways to change that. And then the Replicats use a checkpoint table to help with recovery and to know where they're applying data and what they've done. The other thing in here is, like an Extract process can write to multiple trails and write subsets of data to each one, a Replicat can only process a single set of trail files at once. So it's going to be attached to a specific trail file like trail file AB, and will only be able to read changes from trail file AB. If I have multiple trails that need to be applied into a target system, then I have to set up multiple Replicats to handle that. 03:54 Nikita: So, what are the different Replicat types, Nick? Nick: We have three types in the product today. We have Classic Replicat, which should really only be used for testing purposes or in environments that don't support any of the other specialized Replicats. We have Coordinated Replicat, which is a high speed apply mechanism to apply data into a target system. It does have some parallelism in it, but it's user defined parallelism. And then we have our flagship and that's Parallel Replicat. And this is the most performant lowest latency Replicat that we have. 04:25 Lois: Ok. Let’s dive a little deeper into each of them, starting with the Classic Replicat. How does it work? Nick: It's pretty straightforward. You're going to have a process that reads the trail files, and then in a single threaded fashion it's going to take the trail file logical change record, convert it to an insert, update, or delete, and then apply it into that target database. 
Each transaction that it does is preceded by a change to the checkpoint table. So when the transaction that the Replicat is currently doing is committed, that checkpoint table update also gets committed. That way when the Replicat restarts, it knows exactly what transaction it left off and how it last applied the record. And all the Replicats work the same way with regards to checkpoint tables. They each have their own little method of ensuring that the transaction they're applying is also reflected within the checkpoint table so that when it restarts, it knows exactly where it happened. That way, if a Replicat dies in the middle of a transaction, it can be restarted without any duplicate data or without missing data. 05:29 Did you know that Oracle University offers free courses on Oracle Cloud Infrastructure? You’ll find training on everything from multicloud, database, networking, and security to artificial intelligence and machine learning, all free for our subscribers. So, what are you waiting for? Pick a topic, head over to mylearn.oracle.com, and get started. 05:53 Nikita: Welcome back! Moving on, what about Coordinated Replicat? Nick: The Coordinated Replicat is going to read from a set of trail files. It's going to have multiple threads that do this. So you have your base thread, your coordinated thread that's going to be thread 1. It's going to process the data and apply it into that target database. You then have thread 2, 4, 5, 6, and so on. When you set up your Replicat parameter file for a Coordinated Replicat, the MAP command that maps from one table on the source to a table on the target has an additional option. So you'll have an option called THREAD or THREADRANGE. With the THREAD and THREADRANGE options, you can actually tell which table to go into which thread. 06:38 Lois: Can you give us an example of this? Nick: So I could say map SCOTT.EMP into thread 1 and I want SCOTT.DEPT into thread 2. 
Well, this is fantastic until you realize that Scott.Emp and Scott.Dept have a foreign key between them, a parent-child relationship. What that means is that now I'm going to have to disable that foreign key on the target side, because there's no way for GoldenGate to coordinate the changes in one thread with another thread. And so you really have to be careful about how you pair your tables together. If you don't have any referential integrity on that target database, then you can use Coordinated Replicat at really high degrees of parallelism, and you get some very good performance out of it. Let's say that you have a table that's really got too much data for even a single thread to process, that's where the thread range comes in. The thread range option will use something like the table's primary key to split transactions on that table across multiple threads. So I can say, hey, take my table Scott.Emp and I want to spread transactions across threads 10, 11, 12, 13, and 14 and then spread them evenly based on the primary key. And Coordinated Replicat will do that. So you can get some very high performance numbers out of it and you can really fine tune the tables, especially if you know the amount of data coming into each one. While this does work great, we observed that a lot of customers really don't know their applications to that level of detail, and so we needed a different method to push data into that target database, where we could define the parallelism based on the database expectations. So instead of the customer having to try and figure out what are the parent-child relationships, why can't GoldenGate do it for me? And that led to Parallel Replicat.  08:26 Nikita: And what are the benefits and features of the Parallel Replicat process?  Nick: So Parallel Replicat has been around for quite a few years now. 
It supports most targets. It was Oracle initially, but now it's been expanded out to a lot of the non-Oracle targets and even some of the nonrelational database targets. It has absolutely the best performance of any Replicat process out there. You can use it to split large transactions as well. So if all of a sudden you have a batch job that does a million inserts followed by a single commit, I can split that across 10 threads, each thread doing 100,000 inserts. And it's aware of your transaction dependencies, that's the cool thing. So in Coordinated Replicat, you had to worry about how to split your tables up; in Parallel Replicat, we do it for you. 09:11 Lois: And how does Parallel Replicat work? Nick: So there's three main processes to the Parallel Replicat. The first is the mapper process. This is going to be responsible for taking the data out of the trail files and putting it into kind of our collator and scheduler box. As transactions come from the trail file, they get put into this box in memory where they're processed. There's a collator process that will look at these transactions and go, OK, as they're coming in, let me read some of the data in them to determine whether they can be applied in parallel or not. And so the collator process understands the foreign key dependencies on that target database. And it's able to say, hey, I know that these two tables have a parent-child relationship, I need to make sure that changes on those tables go in the correct order. And so if all of a sudden you see an insert into the parent record and then another insert into the child record and they're mapped together, GoldenGate will ensure that those two transactions go serially and not in parallel, where they could get applied out of order. 
There's then a scheduler process that's going to look at this and say, OK, now that I'm taking transactions from the collator process, who's already identified whether or not transactions can be applied in parallel or serial, and I'm going to feed them off to applier processes that are ready and waiting for me to apply those changes into the database. And then the applier process is waiting for the scheduler process to send its transactions and say, OK, what's my next one? Where's the next transaction I should be working on and applying? And then the applier process is the one actually applying the changes into that target database, again, just using standard DML operations. So there's a lot of benefits to this one. You don't need to worry about your foreign key dependencies, you can leave all your foreign keys enabled. The collator process will actually use information within the trail file to determine which transactions can be applied in parallel, and which one needs to be applied serially. 11:13 Lois: Thank you, Nick, for this insightful conversation. There’s loads more to discover about the Replicat process, and you can do that by heading over to mylearn.oracle.com and searching for the Oracle GoldenGate 23ai: Fundamentals course. Nikita: In our next episode, Nick will take us through managing Extract Trails and Files. Until then, this is Nikita Abraham… Lois: And Lois Houston, signing off! 11:37 That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
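The collator and scheduler behavior Nick describes in this episode (independent transactions go in parallel, dependent ones stay in order) can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how Parallel Replicat is implemented: each transaction is reduced to a hypothetical set of "keys" it touches (a child row also lists its parent's key), and `schedule_waves` is an invented name.

```python
from typing import Dict, Hashable, List, Set, Tuple

# Assumed shape: (transaction id, keys it touches). A child insert lists
# its parent's key too, which is how the dependency shows up here.
Txn = Tuple[str, Set[Hashable]]


def schedule_waves(txns: List[Txn]) -> List[List[str]]:
    """Group transactions, given in commit order, into waves.

    Transactions in the same wave share no keys and could be handed to
    different appliers at once; waves themselves run one after another,
    so a child row is never applied before its parent.
    """
    last_wave: Dict[Hashable, int] = {}  # latest wave that touched each key
    waves: List[List[str]] = []
    for txn_id, keys in txns:
        # One wave after the latest earlier transaction sharing any key.
        wave = 1 + max((last_wave[k] for k in keys if k in last_wave), default=-1)
        if wave == len(waves):
            waves.append([])
        waves[wave].append(txn_id)
        for k in keys:
            last_wave[k] = wave
    return waves
```

For example, two parent inserts into different departments land in the first wave together, and the two child inserts that reference them land in the second wave, each behind its own parent.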
    --------  
    12:06
  • Oracle GoldenGate: Distribution Path, Target Initiated Path, Receiver Server, and Initial Load
    In this episode, Lois Houston and Nikita Abraham dive into key components of Oracle GoldenGate 23ai with expert insights from Nick Wagner, Senior Director of Product Management.   They break down the Distribution Service, explaining how it moves trail files between environments, replaces the classic extract pump, and ensures secure data transfer. Nick also introduces Target Initiated Paths, a method for connecting less secure environments to more secure ones, and discusses how the Receiver Service simplifies monitoring and management. The episode wraps up with a look into Initial Load, covering different methods for syncing source and target databases without downtime.   Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. ----------------------------------------------------------------- Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started! 00:25 Nikita: Welcome to the Oracle University Podcast! I’m Nikita Abraham, Team Lead of Editorial Services with Oracle University, and with me is Lois Houston, Director of Innovation Programs.  Lois: Hey there! Last week, we spoke about the Extract process and today we’re going to spend time discussing the Distribution Path, Target Initiated Path, Receiver Server, and Initial Load. These are all critical components of the GoldenGate architecture, and understanding how they work together is essential for successful data replication. 
00:58 Nikita: To help us navigate these topics, we’ve got Nick Wagner joining us again. Nick is a Senior Director of Product Management for Oracle GoldenGate. Hi Nick! Thanks for being with us today. To kick things off, can you tell us what the distribution service is and how it works? Nick: A distribution path is used when we need to send trail files between two different GoldenGate environments. The distribution service replaces the extract pump that was used in GoldenGate classic architecture. And so the distribution service will send the trail files as they're being created to that receiver service and it will write the trail files over on the target system. The distribution service works in kind of a streaming fashion, so it's constantly polling the trail files that the extract is creating to see if there's any new data. As soon as it sees new data, it'll package it up and send it across the network to the receiver service. It can use a couple of different methods to do this. The most secure and recommended method is using a WebSocket secure connection or WSS. If you're going between a microservices and a classic architecture, you can actually tell the distribution service to send it using the classic architecture method. In that case, it's the OGG option when you're configuring the distribution service. There's also some unsecured methods that would send the trail files in plain text. The receiver service is then responsible for taking that data and rewriting it into the trail file on the target side. 02:23 Lois: Nick, what are some of the key features and responsibilities of the distribution service? Nick: It's responsible for command deployment. So any time that you're going to actually make a command to the distribution service, it gets handled there directly. It can handle multiple commands concurrently. 
It's going to dispatch trail files to one or more receiver servers so you can actually have a single distribution path, send trail files to multiple targets. It can provide some lightweight filtering so you can decide which tables get sent to the target system. And it also is integrated in with our data streams, our pub and subscribe model that we've added in GoldenGate 23ai. 03:01 Lois: Interesting. And are there any protocols to remember when using the distribution service? Nick: We always recommend a secure WebSocket. You also have proxy support for use within cloud environments. And then if you're going to a classic architecture GoldenGate, you would use the Oracle GoldenGate protocol. So in order to communicate with the distribution service and send it commands, you can communicate directly from any web browser, client software-- installation is not required-- or you can also do it through the admin client if necessary, but you can do it directly through browsers. 03:33 Nikita: Ok, let's move on to the target initiated path. Nick, what is it and what does it do essentially? Nick: This is used when you're communicating from a less secure environment to a more secure environment. Often, this requires going through some sort of DMZ. In these situations, a connection cannot be established from the less secure environment into the more secure environment. It actually needs to be established from the more secure environment out. And so if we need to replicate data into a more secure environment, we need to actually have the target GoldenGate environment initiate that connection so that it can be established.  And that's what a target-initiated path does. 04:12 Lois: And how do you set it up? Nick: It's pretty straightforward to set up. You actually don't even need to worry about it on the source side. You actually set it up and configure it from the target. The receiver service is responsible for receiving the trail file data and writing it to the local trail file. 
In this situation, we have a target-initiated path created. And so that receiver service is going to write the trail files locally and the replicat is going to apply that data into that target system. 04:37 Nikita: I also want to ask you about the Receiver service. What is it really? Nick: Receiver service is pretty straightforward. It's a centrally controlled service. It allows you to view the status of your distribution path and replaces target side collectors that were available in the classic architecture of GoldenGate. You can also get statistics about the receiver service directly from the web UI.  You can get detailed information about these paths by going into the receiver service and identifying information like network details, transfer protocols, how many bytes it's received, how many bytes it's sent out. If you need to issue commands from the admin client to the receiver service, you can use the info command to get details about it. Info all will tell you everything that's running. And you can see that your receiver service is up and running. 05:28 Are you working towards an Oracle Certification this year? Join us at one of our certification prep live events in the Oracle University Learning Community. Get insider tips from seasoned experts and learn from others who have already taken their certifications. Go to community.oracle.com/ou to jump-start your journey towards certification today! 05:53 Nikita: Welcome back. In the last section of today’s episode, we’ll cover what Initial Load is. Nick, can you break down the basics for us? Nick: So, the initial load is really used when you need to synchronize the source and target systems. Because GoldenGate is designed for 24/7 environments, we need to be able to do that initial load without taking downtime on the source. And so all the methods that we talk about do not require any downtime for that source database. 06:18 Lois: How do you do the initial load? 
Nick: So there's a couple of different ways to do the initial load. And it really depends on what your topology is. If I'm doing like-to-like replication in a homogeneous environment, we'll say Oracle-to-Oracle, the best options are to use something that's integrated with GoldenGate, some sort of precise instantiation method that does not require HandleCollisions. That's something like a database backup and restoring it to a specific SCN or CSN value, or using a database snapshot. Or in some cases, we can use Oracle Data Pump integration with GoldenGate. There are some less precise instantiation options, which do require HandleCollisions. We also have dissimilar initial load methods. And this is typically when you're going between heterogeneous environments, when my source and target databases don't match and there isn't any kind of fast unload or fast load utility that I could use between those two databases. In almost all cases, this does require HandleCollisions to be used. 07:16 Nikita: Got it. So, with so many options available, are there any advantages to using GoldenGate's own initial load method?  Nick: While some databases do have very good fast load and unload utilities, there are some advantages to using GoldenGate's own initial load method. One, it supports heterogeneous replication environments. So if I'm going from Postgres to Oracle, it'll do all the data type transformation, character set transformation for me. It doesn't require any downtime, if certain conditions are met.  It actually performs transformation as the data is loaded, too, as well as filtering. And so any transformation that you would be doing in your normal transaction log replication or CDC replication can also go through the same transformation for the initial load process. GoldenGate's initial load process does read directly from the source tables. And it fetches the data in arrays. It also uses parallel processing to speed up the replication. 
It does also handle activity on the source tables during the initial load process, so you do not need to worry about quiescing that source database. And a lot of the initial load methods directly built into GoldenGate support distributed applications and analytics targets, including things like Databricks, Snowflake, BigQuery. 08:28 Lois: And what about its limitations? Or to put it differently, when should users consider using different methods? Nick: So the first thing to consider is system proximity. We want to make sure that the two systems we're working with are close together. Or if not, how are we going to send the data across? One thing to keep in mind, when we do the initial load, the source database is not quiesced. So if it takes an hour to do the initial load or 10 hours, it really doesn't matter to GoldenGate. So that's something to keep in mind. Even though we talk about performance of this, the performance really isn't as critical as one might suspect. So the important thing about data system proximity is the proximity to the extract and replicat processes that are going to be pulling the data out and pushing it across. And then how much data is generated? Are we talking about a database that's just a couple of gigabytes? Or are we talking about a database that's hundreds of terabytes? Do we want to consider outage time? Would it be faster to take a little bit of outage and use some other method to move the data across? What kind of outage or downtime windows do we have for these environments? And then another consideration is disk space. As we're pulling the data out of that source database, we need to have somewhere to store it. And if we don't have enough disk space, we need to run to temporary space or to use multiple external drives to be able to support it. So these are all different considerations. 09:50 Nikita: I think we can wind up our episode with that. Thanks, Nick, for giving us your insights.  
Lois: If you'd like to learn more about the topics we covered today, head over to mylearn.oracle.com and check out the Oracle GoldenGate 23ai: Fundamentals course. Nikita: In our next episode, Nick will take us through the Replicat process. Until then, this is Nikita Abraham… Lois: And, Lois Houston signing off! 10:14 That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
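Nick mentions in this episode that GoldenGate's own initial load reads directly from the source tables and fetches the data in arrays. The idea of array fetching can be shown with a minimal Python sketch. It assumes an in-process SQLite database as a stand-in for the source, and `fetch_in_arrays` is a hypothetical helper name; it says nothing about how GoldenGate itself is implemented.

```python
import sqlite3
from typing import Iterator, List, Tuple


def fetch_in_arrays(
    conn: sqlite3.Connection, table: str, array_size: int
) -> Iterator[List[Tuple]]:
    """Read a whole table with SELECT *, yielding rows in fixed-size arrays.

    The table name is interpolated directly, so it must come from trusted
    configuration, never from user input.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    while True:
        rows = cursor.fetchmany(array_size)  # at most array_size rows per call
        if not rows:
            break
        yield rows
```

Fetching in arrays keeps memory bounded by the array size instead of the table size, and it cuts the number of round trips compared with reading one row at a time.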
    --------  
    10:43
  • Oracle GoldenGate 23ai: The Extract Process
    The Extract process is the heart of Oracle GoldenGate 23ai, capturing data changes with precision. In this episode, Lois Houston and Nikita Abraham sit down with Nick Wagner, Senior Director of Product Management, to break down Extract’s role, architecture, and best practices.   Learn how Extract works across different setups, from running on source databases to using a Hub model for greater flexibility. Additionally, understand how trail files, parameter files, and naming conventions impact performance.   Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. -------------------------------------------------------------- Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started! 00:25 Lois: Hello and welcome to the Oracle University Podcast! I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me today is Nikita Abraham, Team Lead: Editorial Services.  Nikita: Hi everyone! Last week, we spoke about installing GoldenGate and today, we’re diving into the Extract process. We’ve discussed it briefly in an earlier episode, but to recap, the Extract process captures changes from the source database and writes them to a trail file. 00:54 Lois: Joining us again is Nick Wagner, Senior Director of Product Management for Oracle GoldenGate. Hi Nick! Before we get into the Extract process, can you walk us through the different architecture options available for GoldenGate. 
Let’s start with when GoldenGate is installed on the same server as the source database. What are the benefits of this architecture? Nick: There's a couple of advantages to this. It means that GoldenGate can use the same resources on that source database. It means that you don't need another host to support the GoldenGate environment. It also means that GoldenGate can use a bequeathed connection to connect from the Extract process into the source database to make it run faster. The restrictions on this are that the Replicat process is highly communicative with the target database. What that really means is that the Replicat process is constantly doing lots of little transactions. And so the network latency between the Replicat process and the target database should really be around 4 milliseconds or less for optimal performance. So that means that a lot of people can't really run GoldenGate on the source system, even though it's an option, because they need that Replicat latency performance. And so they'll often install GoldenGate on the same server as the target database. In this case, they can use the Replicat to connect using a bequeath connection to that target system, you know that it's going to be highly performant and that latency is not going to be an issue. This works really well because the Extract process has actually been optimized to do remote capture. And so it's actually able to handle 80 milliseconds round trip ping time or less between the actual Extract process and the source database itself. And so a lot of customers will opt for this method, where they're actually running GoldenGate away from the target, or excuse me, away from the source database. 02:44 Nikita: Interesting. And is there an option where you don’t need to install GoldenGate on the actual source or target database? Nick: We also have another architecture pattern called a Hub model. 
And this is what you would see in something like OCI GoldenGate or OCI Marketplace, or even in third-party cloud environments where you don't have the ability to install GoldenGate on the actual source or target database. In these cases, GoldenGate is just going to run on a virtual machine or an environment that you have set up specifically for GoldenGate. Now, this GoldenGate Hub doesn't need to have any database software installed. It doesn't need to have any database information on it. It's simply working as a client. So GoldenGate Extract process is a client connecting into the source database and the Replicat is a client connecting into the target database. And this really gives you a lot of flexibility. However, in some cases, there may be too much of a distance, so you won't be able to get both less than 80 milliseconds round trip on the source side and less than 4 milliseconds round trip on the target side. And so in that case, you can have multiple GoldenGate Hubs. And so you would have a Hub on the Extract side and another Hub on the Replicat side. And all these are fully accessible. In this case, you'll actually use the distribution service to send the trail files from one system to another.  04:00 Lois: So, coming to the Extract process, what does it actually do?  Nick: The Extract process is configured to capture changes from that source database. In different terminology, it can subscribe to a topic if we're pulling data out of a Kafka topic or some messaging system like a JMS queue. In relational database language, we're pulling data from the database transaction logs. There's a lot of different sources and targets. You can always use the GoldenGate Certification Matrix to determine which sources and targets are supported, and where we can extract data from. The capture process also connects to the source table for initial loads. 
When we do the initial load, instead of reading from the transaction logs, GoldenGate is actually going to do a select star on that table to get the information it needs for that load. 04:49 Lois: And what about the Extract process group?  Nick: The process group is kind of a grouping of the process itself, which is either going to be my Extract or Replicat, and its associated files. So in an Extract environment, we have our parameter file and a report file and our checkpoint files. The parameter file, the .prm file, is going to list out which objects we're going to capture and how we're going to capture that data. It also controls what we're going to be writing to the trail file and where that trail file exists. The report file is really just a log of what's going on in that Extract process, how it's working, what tables it's encountered. It's used for any troubleshooting to make sure everything is running smoothly. And then you also have the checkpoint files. The checkpoint files and report files should not be modified by the user, the parameter file can be. The checkpoint files are going to include information about where that process is reading from, where it's writing to, and any open transaction that it's tracking as part of the bounded recovery or cache manager functionality. 05:54 Nikita: How do you go about creating an Extract group? Nick: The Extract group can be created by doing an Add Extract command or through the UI. Each Extract must also have a unique name. On the Extract process side, there is an eight-character hard limit for the name itself. And so, you can’t have an Extract process called "my Extract for today is called Nick"; that's more than eight characters. 06:17 Lois: Nick, I was wondering, is there a simple way to identify what an Extract or Replicat is doing? Nick: If you need something to help identify what that Extract or Replicat is doing or the description of it, we do have a description field. 
So when you do the Add Extract or Add Replicat, there is a DESC field that allows you to add more details in. And this is really key because it allows you to put a lot more information that’s going to show up in all the log files at the service manager level. And any time you do an info on the service it’ll also bring up that description field so you can see what’s going on. That way, if you get an alert, a watch, you need to keep track of something you can easily identify what that process is doing and what it’s replicating for.  07:06 Adopting a multicloud strategy is a big step towards future-proofing your business and we’re here to help you navigate this complex landscape. With our suite of courses, you'll gain insights into network connectivity, security protocols, and the considerations of working across different cloud platforms. Start your journey to multicloud today by visiting mylearn.oracle.com. 07:32 Nikita: Welcome back! Before the break, we were talking about the description field, which helps identify what the Extract is doing. Nick, are there any best practices to keep in mind when naming a group? Nick: You also don't want to use any special characters when naming the group, especially you know things like slashes or dashes. You don't want to use spaces in them, just really stick to alphanumeric characters only. The group names are also case insensitive, so EDEPT, all capitalized is the same as edept lowercase. The other thing that you don't want to do and this isn't a hard restriction, it's just more of a friendly reminder is don't end your group with a numeric value. The report files themselves end in numeric values, so you'll have a report file, 0123456789, and so on. If you were to end your group name with a numeric value, then it can often be confused for a report file. And so you don't want to really do that. But otherwise you're free to call it whatever you want. 08:39 Lois: Got it. What about naming conventions? 
Are there any rules that apply? Nick: You can use whatever naming convention you want, but again, try and follow these best practices. No strange characters and don't end your process names with a numeric value. 08:53 Nikita: Can you explain the role of parameter and trail files in the Extract process? Nick: The parameter files are really where GoldenGate is doing all of its hard work. And there's two parameter files, there's the GLOBALS file that's configured at the service manager level, and then you have your actual process parameter files that are configured at your Extract, Replicat, distribution service levels. We also have our trail files. We've talked a little bit about the trail files as they're the continuous source of information for GoldenGate. They can exist on the source or target or intermediate system. An Extract process can write to multiple trail files, but a trail file can only be written to by one process at a time. A trail file name also consists of a two-letter abbreviation, and then we have that nine-digit sequence number. And then the processes that read a trail file include the Distribution Path or Target Initiated Path, as well as the Replicat. And again, just as a quick reminder, all the trail files are stored in the OGG_VAR_HOME/lib/data directory. This is important because they do grow pretty rapidly, and so if you need to keep track of the space used in that directory, this is an important thing to do. 10:05 Nikita: How does GoldenGate handle data extraction for different database systems? Nick: You have a couple of different options. For non-Oracle databases, you have a classic extract. And these read directly from the transaction logs themselves or they call an API within that source database. For example, within DB2, we actually use the DB2 API to pull data out of the transaction logs. In something like Postgres, we're getting data directly out of the transaction logs by reading it. 
And that's put in there by the test_decoding plugin. With Oracle, it's a little bit different. Because it's actually integrated in with the database kernel itself, all the log mining is done inside the database engine. All GoldenGate is doing is connecting to that log mining area and getting the data from it. We also have the option of doing standby redo logs or downstream capture, which allows GoldenGate to read data from the standby redo logs. It's still an integrated extract. So GoldenGate isn't directly reading the standby redo logs, but it allows you to set it up in a different way. And then we have an initial load extract, which is to pull out the base data out of the tables by doing a select star on the tables. And this does not include any change data. So the initial loads and the what we call tran log extracts are separate. 11:24 Lois: Before we wrap up, can you quickly walk us through the key steps to configure an extract? Nick: So the first thing we want to do is set up a restart profile so that if the process fails, it'll automatically restart. Next thing we do is we create the extract parameter file. Then we'll go ahead and register the extract with the database. This is important in Oracle and also with a couple of other databases that we support where we actually need to tell the database engine that, hey, we're going to be setting up a process that's going to pull data out of that environment. Then we go ahead and add the extract process. We add the trail file so that it knows where to write to. And then we can go ahead and start the extract process and we'll be on our way to configuring replication. If necessary, we can configure a distribution service path as well. 12:09 Nikita: Thanks, Nick, for telling us about the Extract process. If you'd like to learn more about what we discussed today, head over to mylearn.oracle.com and check out the Oracle GoldenGate 23ai: Fundamentals course. 
Lois: You'll find detailed instructions to help you get started. In the next episode, we’ll go through the Distribution Path, Target Initiated Path, Receiver Server, and Initial Load. Until then, this is Lois Houston… Nikita: And Nikita Abraham, signing off! 12:37 That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
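The naming guidance from this episode (an eight-character limit for Extract names, alphanumeric characters only, case insensitive, avoid a trailing digit, and trail files named with a two-letter abbreviation plus a nine-digit sequence number) is easy to turn into a quick check. The Python sketch below is only a convenience built from what Nick says; the function names are hypothetical and GoldenGate does not ship this helper.

```python
import re
from typing import List

MAX_GROUP_NAME_LENGTH = 8  # hard limit cited in the episode for Extract names


def check_group_name(name: str) -> List[str]:
    """Return the episode's naming rules that a group name breaks (empty if none)."""
    problems = []
    if len(name) > MAX_GROUP_NAME_LENGTH:
        problems.append("longer than 8 characters")
    if not re.fullmatch(r"[A-Za-z0-9]+", name):
        problems.append("contains non-alphanumeric characters")
    if name and name[-1].isdigit():
        # Best practice rather than a hard rule: report files end in digits.
        problems.append("ends with a digit")
    return problems


def same_group(a: str, b: str) -> bool:
    """Group names are case insensitive: EDEPT and edept are the same group."""
    return a.upper() == b.upper()


def trail_file_name(prefix: str, sequence: int) -> str:
    """Build a trail file name: two-character prefix plus nine-digit sequence."""
    if len(prefix) != 2 or not 0 <= sequence <= 999_999_999:
        raise ValueError("need a two-character prefix and a sequence of at most nine digits")
    return f"{prefix}{sequence:09d}"
```

Returning a list of broken rules rather than a single true or false makes it easy to tell a hard limit (length) apart from a friendly reminder (the trailing digit).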
    --------  
    13:06

About Oracle University Podcast

Oracle University Podcast delivers convenient, foundational training on popular Oracle technologies such as Oracle Cloud Infrastructure, Java, Autonomous Database, and more to help you jump-start or advance your career in the cloud.
Generated: 7/3/2025 - 4:27:35 AM