In this post, we will set up a simple Flume agent that uses Netcat as the source and the console as the sink.
In this single-node Flume deployment, we create a Netcat source that listens on a port (localhost:44444) for network connections, and a logger sink that logs the received network traffic to the console.
To send network traffic, we can use either the curl utility or the traditional telnet tool. We use curl in this post.
Flume Agent Setup:
Step 1 – Create flume.conf file:
We have to set up the configuration properties for the Flume agent in a file. This configuration file needs to define the sources, channels and sinks, grouped under the agent.
Let’s create a flume.conf properties file under FLUME_CONF_DIR (FLUME_HOME/conf) or any other preferred location and add the properties below to it.
This configuration file defines an agent called Agent1. We can give any names to agents, sources, sinks and channels; in this example we have chosen the descriptive names Agent1, netcat-source, memory-channel and logger-sink.
# Name the components on this agent
Agent1.sources = netcat-source
Agent1.channels = memory-channel
Agent1.sinks = logger-sink
# Describe/configure Source
Agent1.sources.netcat-source.type = netcat
Agent1.sources.netcat-source.bind = localhost
Agent1.sources.netcat-source.port = 44444
# Describe the sink
Agent1.sinks.logger-sink.type = logger
# Use a channel which buffers events in memory
Agent1.channels.memory-channel.type = memory
Agent1.channels.memory-channel.capacity = 1000
Agent1.channels.memory-channel.transactionCapacity = 100
# Bind the source and sink to the channel
Agent1.sources.netcat-source.channels = memory-channel
Agent1.sinks.logger-sink.channel = memory-channel
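As a quick sanity check on the wiring above, the following Python sketch parses the properties text and verifies that every channel referenced by a source or sink is declared for the agent. The helper names parse_properties and check_wiring are hypothetical and are not part of Flume, and the parser only handles simple key = value lines and # comments.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of problems found in the agent's channel wiring."""
    declared = set(props.get(f"{agent}.channels", "").split())
    problems = []
    for source in props.get(f"{agent}.sources", "").split():
        # A source lists its channels under the plural key 'channels'.
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source} has no channels")
        for ch in used:
            if ch not in declared:
                problems.append(f"source {source} uses undeclared channel {ch}")
    for sink in props.get(f"{agent}.sinks", "").split():
        # A sink names its channel under the singular key 'channel'.
        ch = props.get(f"{agent}.sinks.{sink}.channel")
        if ch is None:
            problems.append(f"sink {sink} has no channel")
        elif ch not in declared:
            problems.append(f"sink {sink} uses undeclared channel {ch}")
    return problems


# The configuration from this post, with the comment lines left out.
CONFIG = """
Agent1.sources = netcat-source
Agent1.channels = memory-channel
Agent1.sinks = logger-sink
Agent1.sources.netcat-source.type = netcat
Agent1.sources.netcat-source.bind = localhost
Agent1.sources.netcat-source.port = 44444
Agent1.sinks.logger-sink.type = logger
Agent1.channels.memory-channel.type = memory
Agent1.channels.memory-channel.capacity = 1000
Agent1.channels.memory-channel.transactionCapacity = 100
Agent1.sources.netcat-source.channels = memory-channel
Agent1.sinks.logger-sink.channel = memory-channel
"""

print(check_wiring(parse_properties(CONFIG), "Agent1"))  # prints []
```

An empty list means the source and the sink both point at a declared channel. Note the asymmetry the check relies on: the source uses the key channels, while the sink uses channel.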
We will discuss each property of the configuration file in detail in the next post; until then, let’s not worry about the details of these properties.
NOTE: We can define multiple agents in the same configuration file and launch a specific agent using the --name flag.
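To illustrate the note, a single file could hold two agents, each identified by the first segment of its property keys. The Python sketch below uses a made-up second agent, Agent2, and a hypothetical helper agent_names to list the agents that a config text defines; the value passed to --name would be one of them.

```python
def agent_names(text):
    """Collect agent names, i.e. the first dotted segment of each key."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip().split(".", 1)[0]
        if name not in names:
            names.append(name)
    return names


# Agent2 and its component names are invented for this illustration.
TWO_AGENTS = """
Agent1.sources = netcat-source
Agent1.channels = memory-channel
Agent1.sinks = logger-sink
Agent2.sources = netcat-source-2
Agent2.channels = memory-channel-2
Agent2.sinks = logger-sink-2
"""

print(agent_names(TWO_AGENTS))  # prints ['Agent1', 'Agent2']
```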
Step 2 – Starting Flume Agent:
We can start the Flume agent with the command below, after changing the values as appropriate.
An agent is started using a shell script called flume-ng, which is located in the bin directory of the Flume distribution. We need to specify the agent name, the config directory, and the config file on the command line:
$ flume-ng agent --conf $FLUME_CONF_DIR --conf-file $FLUME_CONF_DIR/flume.conf --name Agent1 -Dflume.root.logger=INFO,console
The agent argument in the above flume-ng command tells Flume to start an agent, which is the generic name for a running Flume process.
The agent now starts the source and sink configured in the given properties file. Once the startup messages have been logged, no further output appears on that screen until events arrive.
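If the agent needs to be launched from a script rather than typed by hand, the same command line can be assembled in Python. This is only a sketch: build_flume_command and start_agent are hypothetical helper names, the fallback directory /opt/flume/conf is an example path, and flume-ng is assumed to be on the PATH. The flags are exactly the ones used in the command above.

```python
import os
import subprocess


def build_flume_command(conf_dir, agent_name, conf_file="flume.conf"):
    """Assemble the flume-ng argument list used in this post."""
    return [
        "flume-ng", "agent",
        "--conf", conf_dir,
        "--conf-file", os.path.join(conf_dir, conf_file),
        "--name", agent_name,
        "-Dflume.root.logger=INFO,console",
    ]


def start_agent(conf_dir, agent_name):
    """Launch the agent as a child process (requires flume-ng on PATH)."""
    return subprocess.Popen(build_flume_command(conf_dir, agent_name))


# Print the command without running it; the fallback path is only an example.
conf_dir = os.environ.get("FLUME_CONF_DIR", "/opt/flume/conf")
print(" ".join(build_flume_command(conf_dir, "Agent1")))
```

Passing the arguments as a list avoids shell quoting problems if the config directory contains spaces.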
Below is the screenshot of the agent starting.
In the above screenshot, we can observe the log messages Processing: logger-sink, Creating channel memory-channel, Channel memory-channel connected to [netcat-source, logger-sink] and finally, at the bottom, Source starting and created serverSocket [/127.0.0.1:44444]. These messages are helpful for debugging if something goes wrong and the agent does not start.
From these log messages we can confirm that agent Agent1 has started successfully and that netcat-source is listening for network traffic on 127.0.0.1:44444.
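Besides reading the log, we can confirm that something is listening on the port by attempting a TCP connection. The Python sketch below does that with the standard socket module; is_listening is a hypothetical helper name. It only shows that the port accepts connections, not that the listener is Flume.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# True while the agent from this post is running, False otherwise.
print(is_listening("localhost", 44444))
```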
Step 3 – Feeding events to Flume Agent:
Now that the agent has started successfully, netcat-source is listening for events from external sources on localhost:44444. So, let’s send events to the Flume agent on that port using the curl or telnet utility.
Either of the commands below opens a connection to that port, the first with curl and the second with telnet:
$ curl telnet://localhost:44444
$ telnet localhost 44444
Once connected to the port, enter some sample lines of input. Here we entered the lines below, pressing Enter after each one. Press Ctrl+C to close the curl connection.
Hello World !!!
You Have Started a Flume Agent Successfully.
Below is the screenshot of the other terminal, from which we fed input to netcat-source.
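The same input can also be sent from a program instead of an interactive curl or telnet session. The following Python sketch opens a TCP connection, sends each line as one event, and reads one reply line per event. It assumes the Netcat source is running with its default behaviour of acknowledging every event with OK; send_lines is a hypothetical helper name.

```python
import socket


def send_lines(host, port, lines, timeout=5.0):
    """Send each line as one event and return the source's reply lines."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        with conn.makefile("r", encoding="utf-8", newline="\n") as reader:
            for line in lines:
                # The Netcat source treats each newline-terminated line as an event.
                conn.sendall((line + "\n").encode("utf-8"))
                # Assumes one acknowledgement line per event (the default).
                replies.append(reader.readline().strip())
    return replies
```

With the agent from Step 2 running, calling send_lines("localhost", 44444, ["Hello World !!!", "You Have Started a Flume Agent Successfully."]) sends the two sample lines used here. If acknowledgements are turned off on the source, readline would block until the timeout, so this sketch is only suitable for the default setup.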
Step 4 – Verify the events at Agent Console:
Now we can verify, at the agent console, the events sent through the curl connection. Below is the screenshot of the agent console.
In the above screenshot, we can clearly see the events (Hello World !!!) received by netcat-source and logged to the agent console.
Step 5 – Stop Flume Agent:
Once we have finished capturing network events, we can stop the agent by pressing Ctrl+C in the agent's terminal.
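When the agent was started from a script rather than an interactive terminal, the equivalent of Ctrl+C is to send the process an interrupt signal. The Python sketch below shows this on a POSIX system with a stand-in child process, so it runs without Flume installed; stop_agent is a hypothetical helper name, and the fallback to terminate() is an added safety net in case the interrupt is ignored.

```python
import signal
import subprocess
import sys


def stop_agent(proc, timeout=10.0):
    """Send SIGINT (what Ctrl+C sends) and wait; fall back to terminate()."""
    proc.send_signal(signal.SIGINT)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child ignored SIGINT or is slow to exit, so ask again with SIGTERM.
        proc.terminate()
        return proc.wait()


# Stand-in for the agent: a child process that would otherwise sleep for a minute.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    stderr=subprocess.DEVNULL,
)
stop_agent(child, timeout=5.0)
print(child.poll() is not None)  # prints True
```

For a real agent, proc would be the Popen object returned when flume-ng was launched, assuming the signal reaches the agent's Java process.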