Technology

Kafka 05: Kafka Consumer with ElasticSearch

June 20th, 2019 (timestamp 2019-06-20T23:45:19+10:00) | Categories: Technology

Pre Configure
1. Register Bonsai: https://app.bonsai.io/clusters/kafkaelastic2-3806712511/console
2. Increase the maximum limit (we will demonstrate batch processing, so a limit of 1000 is easily reached).
3. Install the Maven dependency.

The Real Code
1. Set up the ElasticSearch client.
2. Create a Kafka consumer.
3. Post records from Kafka into ElasticSearch (Bonsai).

Consumer Offset Commit Strategy
At most once versus at least once:
At most once: offsets are committed as soon as the message is received. If the processing goes wrong, the message will be lost and it [...]
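The difference between the two commit strategies can be illustrated with a small simulation that needs no Kafka client at all. This is only a sketch under stated assumptions: the class `CommitStrategyDemo`, the method `consume`, and the single simulated crash are hypothetical names and behaviour made up for illustration, not code from the post. With "at most once" the offset is committed before processing, so a crash loses the rest of the batch. With "at least once" the offset is committed after processing, so a crash causes the batch to be read again.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitStrategyDemo {

    /**
     * Replays one batch through a simulated consumer that crashes once, just
     * before processing the record at index crashAt, then restarts from the
     * last committed offset. Returns every record that was processed.
     */
    public static List<String> consume(List<String> batch,
                                       boolean commitBeforeProcessing,
                                       int crashAt) {
        List<String> processed = new ArrayList<>();
        int committedOffset = 0;   // where a restarted consumer resumes
        boolean crashed = false;   // the simulated crash happens only once

        while (committedOffset < batch.size()) {
            int start = committedOffset;
            if (commitBeforeProcessing) {
                // At most once: commit as soon as the batch is received.
                committedOffset = batch.size();
            }
            boolean failed = false;
            for (int i = start; i < batch.size(); i++) {
                if (!crashed && i == crashAt) {
                    crashed = true;
                    failed = true;
                    break;         // processing goes wrong here
                }
                processed.add(batch.get(i));
            }
            if (!failed && !commitBeforeProcessing) {
                // At least once: commit only after the batch was processed.
                committedOffset = batch.size();
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> batch = List.of("m0", "m1", "m2", "m3");
        // prints "at most once : [m0, m1]" (m2 and m3 are lost)
        System.out.println("at most once : " + consume(batch, true, 2));
        // prints "at least once: [m0, m1, m0, m1, m2, m3]" (m0 and m1 repeat)
        System.out.println("at least once: " + consume(batch, false, 2));
    }
}
```

Because "at least once" can deliver the same record twice, the processing step should be idempotent, for example by writing each record under a stable id.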

Kafka 04: Producer Configurations

June 12th, 2019 (timestamp 2019-06-12T00:17:00+10:00) | Categories: Technology

Overview
The overview of our TwitterProducer config.

About acks
acks = 0
  - No response is required.
  - If the broker goes offline, we will lose data.
  - Useful for data where some loss is acceptable: metrics, log collection.
acks = 1
  - A leader response is requested; no replication is required.
  - The producer may retry if the ack from the leader is not received.
  - If the leader goes offline before the replicas have copied the data, we will lose data.
acks = all
  - A response from the leader and the replicas is required.
  - Adds latency and safety.
  - No data loss as long as enough replicas are in sync.

[...]
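As a rough illustration of where the `acks` setting lives, the sketch below builds producer properties with plain `java.util.Properties`. The configuration keys (`bootstrap.servers`, `key.serializer`, `value.serializer`, `acks`) are real Kafka producer settings, but the class `ProducerAcksConfig`, the method `producerProps`, and the broker address `127.0.0.1:9092` are assumptions chosen for this example and are not taken from the post's TwitterProducer.

```java
import java.util.Properties;
import java.util.Set;

public class ProducerAcksConfig {

    // The three values discussed above. Kafka also accepts "-1" as a
    // synonym for "all"; this sketch deliberately leaves it out.
    private static final Set<String> VALID_ACKS = Set.of("0", "1", "all");

    /** Builds producer properties with the requested acks level. */
    public static Properties producerProps(String bootstrapServers, String acks) {
        if (!VALID_ACKS.contains(acks)) {
            throw new IllegalArgumentException("acks must be 0, 1 or all, got: " + acks);
        }
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // "0" = fire and forget, "1" = leader only, "all" = leader + replicas
        props.setProperty("acks", acks);
        return props;
    }

    public static void main(String[] args) {
        // Assumed local broker address; adjust to your own cluster.
        Properties props = producerProps("127.0.0.1:9092", "all");
        System.out.println("acks = " + props.getProperty("acks"));
    }
}
```

In a project with the Kafka client on the classpath, these properties would be passed to the `KafkaProducer` constructor, and the same keys are available as constants on `ProducerConfig` (for example `ProducerConfig.ACKS_CONFIG`).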

Kafka 03: Twitter Producer (Java)

June 11th, 2019 (timestamp 2019-06-12T00:10:54+10:00) | Categories: Technology

Set Up a Twitter Developer Account and Create an App
Link: https://developer.twitter.com/en/apps
You need to give a good reason for requesting access and a detailed description of your app.

Get Dependencies
Link: https://github.com/twitter/hbc

Create New Producer and Consumer
Step 1: Overview
Step 2: Create a Twitter Client
Step 3: Create a Kafka Producer
Step 4: Create a Topic
Step 5: Launch Kafka Console Consumer

Full Code
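The hbc client delivers raw messages into a `BlockingQueue<String>`, so the link between Step 2 and Step 3 is essentially a polling loop that drains that queue and hands each message to the producer. The sketch below shows only that loop, using the standard library. The class `TweetForwarder`, the method `forward`, and the `sink` callback (a stand-in for a call such as `producer.send(...)`) are hypothetical names for illustration; the post's own code is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class TweetForwarder {

    /**
     * Polls the queue until it stays empty for timeoutMs milliseconds and
     * passes every message to the sink. Returns the number forwarded.
     */
    public static int forward(BlockingQueue<String> queue,
                              Consumer<String> sink,
                              long timeoutMs) {
        int forwarded = 0;
        try {
            String msg;
            // poll() returns null once the timeout passes with no message
            while ((msg = queue.poll(timeoutMs, TimeUnit.MILLISECONDS)) != null) {
                sink.accept(msg);   // a real producer would send to a topic here
                forwarded++;
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop forwarding.
            Thread.currentThread().interrupt();
        }
        return forwarded;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1000);
        queue.add("{\"text\":\"hello\"}");
        queue.add("{\"text\":\"kafka\"}");

        List<String> sent = new ArrayList<>();
        int count = forward(queue, sent::add, 100);
        System.out.println("forwarded " + count + " messages");
    }
}
```

Keeping the sink as a plain callback makes the loop easy to test without a broker or a Twitter connection.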

Kafka 02: Install and CLI

May 27th, 2019 (timestamp 2019-06-12T00:10:38+10:00) | Categories: Technology

Install
Download https://www.apache.org/dyn/closer.cgi?path=/kafka/2.2.0/kafka_2.12-2.2.0.tgz, unzip it, and place the extracted folder under C:\
Create a folder data under Kafka's root dir.
Create two folders, kafka and zookeeper, under the data folder.
Change properties:
  config\zookeeper.properties: dataDir=C:/kafka_2.12-2.2.0/data/zookeeper
  config\server.properties: log.dirs=C:/kafka_2.12-2.2.0/data/kafka

Launch Kafka
Add to the PATH environment variable: C:\kafka_2.12-2.2.0\bin\windows
Launch Zookeeper FIRST, then start Kafka:
  zookeeper-server-start.bat config\zookeeper.properties
  kafka-server-start.bat config\server.properties

kafka-topics
1. Create
  C:\kafka_2.12-2.2.0>kafka-topics --zookeeper 127.0.0.1:2181 --topic first_topic --create --partitions 3 --replication-factor 1
  If a list of commands with descriptions is printed instead, something is wrong with the command.
  When creating a topic, we need to specify how many partitions [...]

Kafka 01: The Theory

May 22nd, 2019 (timestamp 2019-06-12T00:10:44+10:00) | Categories: Technology

Topics, Partitions and Offsets
Topics: A particular stream of data, similar to a table in a database. You can have as many topics as you want. A topic is identified by its name.
Partitions: A topic is split into partitions. Each partition is ordered. When you create a topic, you have to define how many partitions you want.
Offset: Within a partition, each message gets an auto-incrementing id, and the sequence is unbounded. An offset only has meaning within a specific partition: offset 1 in partition 1 says nothing about offset 1 in partition 2. Order is guaranteed only within a single partition.
Data [...]
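The relationship between topics, partitions and offsets can be modelled in a few lines. The toy class below is only an in-memory sketch, assuming each partition is an append-only list; `ToyTopic`, `append`, and `read` are hypothetical names and have nothing to do with the real Kafka API. It shows that every partition keeps its own offset counter, so the same offset number refers to different messages in different partitions.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyTopic {
    private final String name;
    // One append-only log per partition.
    private final List<List<String>> partitions = new ArrayList<>();

    public ToyTopic(String name, int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("a topic needs at least one partition");
        }
        this.name = name;
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Appends a message to one partition and returns its offset there. */
    public int append(int partition, String message) {
        List<String> log = partitions.get(partition);
        log.add(message);
        // Offsets start at 0 and grow by one per message. Real Kafka offsets
        // are 64-bit longs; int is enough for this toy model.
        return log.size() - 1;
    }

    /** Reads the message stored at the given offset of one partition. */
    public String read(int partition, int offset) {
        return partitions.get(partition).get(offset);
    }

    public String name() {
        return name;
    }

    public static void main(String[] args) {
        ToyTopic topic = new ToyTopic("first_topic", 3);
        topic.append(0, "a");   // partition 0, offset 0
        topic.append(0, "b");   // partition 0, offset 1
        topic.append(1, "c");   // partition 1, offset 0
        // Offset 0 means something different in each partition:
        System.out.println(topic.read(0, 0)); // prints "a"
        System.out.println(topic.read(1, 0)); // prints "c"
    }
}
```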