solrcloud


SolrCloud implementation

Objective:
Solr 7.2
Set up SolrCloud with 3 nodes on Ubuntu. Each node runs on its own port, and the nodes communicate and share data with each other.
Description:
There are two types of Solr implementation:
1)  Standalone:
A single Solr instance without a cluster. It is easy to set up, so I will not cover it here.
2)  SolrCloud:
Running Solr as a cluster of nodes, usually on more than one server, is called SolrCloud. This is what I will set up.
Local host setup on Ubuntu:
Create the following folders in the /opt directory and paste the Solr and ZooKeeper files into them.
1)  Node1 setup:
1.1) /opt/solr/node1/solr
Paste the Solr files in this folder.
1.2) /opt/solr/node1/zk
Paste the ZooKeeper setup files here.
2)  Node2 setup:
2.1) /opt/solr/node2/solr
Paste the Solr files in this folder.
2.2) /opt/solr/node2/zk
Paste the ZooKeeper setup files here.
3)  Node3 setup:
3.1) /opt/solr/node3/solr
Paste the Solr files in this folder.
3.2) /opt/solr/node3/zk
Paste the ZooKeeper setup files here.
4)  ZooKeeper data folders:
ZooKeeper uses these folders internally to store its snapshot files. We will later set this path in the ZooKeeper settings file (zoo.cfg).
4.1) /opt/solr/zookeeper_data/zk1/
Create a file named myid in this folder, write 1 in it, and save it without a file extension. This file identifies ZooKeeper server 1.
4.2) /opt/solr/zookeeper_data/zk2/
Create a file named myid in this folder, write 2 in it, and save it without a file extension. This file identifies ZooKeeper server 2.
4.3) /opt/solr/zookeeper_data/zk3/
Create a file named myid in this folder, write 3 in it, and save it without a file extension. This file identifies ZooKeeper server 3.
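The folder layout and myid files above can also be created from the command line. The following is only a sketch: SOLR_BASE is an assumed helper variable (not something Solr or ZooKeeper reads), and it defaults to a scratch directory so the script can be tried safely. Set SOLR_BASE=/opt/solr to get the layout used in this post.

```shell
#!/bin/sh
# Sketch: create the node folders and ZooKeeper data folders described above.
# SOLR_BASE is an assumed variable; it defaults to a scratch directory.
set -e
SOLR_BASE="${SOLR_BASE:-./solrcloud-demo}"

for i in 1 2 3; do
  # One folder for Solr and one for ZooKeeper per node
  mkdir -p "$SOLR_BASE/node$i/solr" "$SOLR_BASE/node$i/zk"
  # One data folder per ZooKeeper server
  mkdir -p "$SOLR_BASE/zookeeper_data/zk$i"
  # myid holds only the server number and has no file extension
  echo "$i" > "$SOLR_BASE/zookeeper_data/zk$i/myid"
done

ls "$SOLR_BASE/zookeeper_data"
```

The Solr and ZooKeeper files still have to be copied into the node folders by hand.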
Settings:
Now we will configure the Solr nodes and the ZooKeeper instances.
1)  Node1 settings:
1.1)     Solr setting: in
/opt/solr/node1/solr/server/solr/solr.xml
the port should be jetty.port:8983

1.2)     ZooKeeper setting: in
/opt/solr/node1/zk/conf/ rename zoo_sample.cfg to zoo.cfg
and make sure it contains:
clientPort=2181
dataDir=/opt/solr/zookeeper_data/zk1/

and paste the following at the bottom:

server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

2)  Node2 settings:
2.1)     Solr setting: in
/opt/solr/node2/solr/server/solr/solr.xml
the port should be jetty.port:8984
2.2)     ZooKeeper setting: in
/opt/solr/node2/zk/conf/ rename zoo_sample.cfg to zoo.cfg
and make sure it contains:
clientPort=2182
dataDir=/opt/solr/zookeeper_data/zk2/

and paste the following at the bottom:

server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

3)  Node3 settings:
3.1)     Solr setting: in
/opt/solr/node3/solr/server/solr/solr.xml
the port should be jetty.port:8985
3.2)     ZooKeeper setting: in
/opt/solr/node3/zk/conf/ rename zoo_sample.cfg to zoo.cfg
and make sure it contains:
clientPort=2183
dataDir=/opt/solr/zookeeper_data/zk3/

and paste the following at the bottom:

server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
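
Since the three zoo.cfg files differ only in clientPort and dataDir, they can be generated in a loop. This is a sketch under assumptions: SOLR_BASE and DATA_ROOT are helper variables introduced here, the output goes to a scratch directory by default, and tickTime, initLimit and syncLimit are the values that ship in zoo_sample.cfg.

```shell
#!/bin/sh
# Sketch: write one zoo.cfg per node with the settings listed above.
set -e
SOLR_BASE="${SOLR_BASE:-./solrcloud-demo}"          # assumed; use /opt/solr for the real layout
DATA_ROOT="${DATA_ROOT:-/opt/solr/zookeeper_data}"  # assumed; parent of zk1, zk2, zk3

for i in 1 2 3; do
  conf_dir="$SOLR_BASE/node$i/zk/conf"
  mkdir -p "$conf_dir"
  # clientPort is 2181, 2182, 2183 for nodes 1, 2, 3
  cat > "$conf_dir/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
clientPort=$((2180 + i))
dataDir=$DATA_ROOT/zk$i/
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
EOF
done

grep clientPort "$SOLR_BASE"/node*/zk/conf/zoo.cfg
```

The server.N lines are identical in every file; the number N must match the myid file in the corresponding dataDir.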

Now the local setup is ready.

Start ZooKeeper and Solr:
You must start the 3 ZooKeeper instances separately first, then start the 3 Solr nodes from the command prompt.
1)  Start ZooKeeper:
/opt/solr/node1/zk/bin/zkServer.sh start
/opt/solr/node2/zk/bin/zkServer.sh start
/opt/solr/node3/zk/bin/zkServer.sh start

2)  Start Solr:
Do not start Solr as root: it will refuse to start and print a warning unless you pass the -force argument.
Here we tell Solr which ZooKeeper to point to.
-c => cloud mode, -p => port, -z => ZooKeeper host

/opt/solr/node1/solr/bin/solr start -c -p 8983 -z localhost:2181
/opt/solr/node2/solr/bin/solr start -c -p 8984 -z localhost:2182
/opt/solr/node3/solr/bin/solr start -c -p 8985 -z localhost:2183
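
The start sequence above can be wrapped in a small script. This is a sketch, not part of the original setup: DRY_RUN and SOLR_BASE are assumed helper variables, and by default the script only prints the commands. It also passes all three ZooKeeper addresses to -z as a comma-separated list, which Solr accepts and which lets each node keep working if one ZooKeeper instance is down.

```shell
#!/bin/sh
# Sketch: print (or run) the start commands for all three nodes.
# DRY_RUN and SOLR_BASE are assumed helper variables, not Solr options.
SOLR_BASE="${SOLR_BASE:-/opt/solr}"
DRY_RUN="${DRY_RUN:-1}"
ZK_HOSTS="localhost:2181,localhost:2182,localhost:2183"

# Print the command in dry-run mode, otherwise execute it
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

start_cluster() {
  # ZooKeeper first, so Solr finds a running ensemble
  for i in 1 2 3; do
    run "$SOLR_BASE/node$i/zk/bin/zkServer.sh" start
  done
  # Then the Solr nodes on ports 8983, 8984, 8985
  port=8983
  for i in 1 2 3; do
    run "$SOLR_BASE/node$i/solr/bin/solr" start -c -p "$port" -z "$ZK_HOSTS"
    port=$((port + 1))
  done
}

start_cluster
```

Run it with DRY_RUN=0 to actually start the processes.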

Create collection:
A collection is like a database in MySQL; it holds the data in Solr.
There is no concept of tables as in MySQL; the collection alone holds the data. Creating one takes two steps.

1)  Upload settings for the collection:
Solr ships with some reusable sample settings (/opt/solr/node1/solr/server/solr/configsets/sample_techproducts_configs/conf). We will upload these settings to the first ZooKeeper (/opt/solr/node1/zk). The name of this config will be test-config.
We do not need to repeat this for node2 and node3; it automatically applies to all three nodes. You cannot simply copy and paste these files;
use the command line instead. This is called upconfig.

/opt/solr/node1/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -zkhost 127.0.0.1:2181 -confdir /opt/solr/node1/solr/server/solr/configsets/sample_techproducts_configs/conf/ -confname test-config

2)  Create collection:
In the command prompt:

2.1)     curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=test_me&numShards=3&replicationFactor=1&collection.configName=test-config'

This creates one shard per node, with no replicas, and it is not a master/slave setup. If you want more than one shard or replica on a node, set maxShardsPerNode as follows.
2.2)     curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=test_me&numShards=3&replicationFactor=1&maxShardsPerNode=6&collection.configName=test-config'

This allows up to 6 shards or replicas on each node. Replicas (the leader/follower structure described below) are created only when replicationFactor is greater than 1.
The collection name is now test_me, and we will go with option 2.2).
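
How numShards, replicationFactor and maxShardsPerNode interact can be checked before calling the API: the collection needs numShards x replicationFactor cores in total, and they fit only if that number is at most (number of live nodes) x maxShardsPerNode. The sketch below builds the CREATE URL and applies that check; fits, create_url and SOLR_URL are hypothetical helper names, and the example values (3 shards, 2 replicas each) are an illustration rather than the values used above.

```shell
#!/bin/sh
# Sketch: build the Collections API CREATE URL and sanity-check capacity.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Succeeds when numShards * replicationFactor <= maxShardsPerNode * liveNodes
fits() {  # fits NUM_SHARDS REPLICATION_FACTOR MAX_PER_NODE LIVE_NODES
  [ $(( $1 * $2 )) -le $(( $3 * $4 )) ]
}

# Prints the CREATE URL; pass it to curl to create the collection
create_url() {  # create_url NAME NUM_SHARDS REPLICATION_FACTOR MAX_PER_NODE CONFIG
  echo "$SOLR_URL/admin/collections?action=CREATE&name=$1&numShards=$2&replicationFactor=$3&maxShardsPerNode=$4&collection.configName=$5"
}

# Example: 3 shards x 2 copies = 6 cores on 3 nodes needs maxShardsPerNode >= 2
if fits 3 2 2 3; then
  create_url test_me 3 2 2 test-config
else
  echo "not enough capacity: raise maxShardsPerNode or add nodes" >&2
fi
```

To create the collection, pass the printed URL to curl in quotes, for example curl "$(create_url test_me 3 2 2 test-config)".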

Master/slave in Solr:
There is no concept of master/slave in SolrCloud; instead, Solr uses the concept of leaders and followers. If you have shard 1 and a replica of shard 1, then shard 1 is the leader and its replica is the follower. This means you should always create replicas of your shards.

Create schema:
There are 2 types of schema in Solr:
1) managed-schema (the file has no .xml extension in Solr 7.2): with this we can use the Schema API, and it should not be edited by hand. It is recommended, and we will go with it.
2) schema.xml: we can edit it manually, but the Schema API is not supported with it. Not recommended.

To create the schema, use the admin panel or the Schema API.
I will create the schema using the admin panel.
For integer values choose string or pint, and for text values use text_general. Create the following fields now:

ab_client_id=>string,
ab_org_type=>string,
ab_address=>text_general,
ab_modified=>pdate
Note: the id field is the unique key by default in managed-schema, and I will use it here as the unique key.
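
The same fields can be added through the Schema API instead of the admin panel. The sketch below only prints the curl commands so they can be reviewed first; field_json, add_field_cmd, SOLR_URL and COLLECTION are hypothetical helper names, and setting "stored" to true is an assumption about how the fields should be configured.

```shell
#!/bin/sh
# Sketch: add the fields above through the Schema API instead of the admin panel.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-test_me}"

# Builds the JSON body for one add-field command
field_json() {  # field_json NAME TYPE
  printf '{"add-field":{"name":"%s","type":"%s","stored":true}}' "$1" "$2"
}

# Prints the curl command for one field; pipe the output to sh to execute it
add_field_cmd() {  # add_field_cmd NAME TYPE
  printf "curl -X POST -H 'Content-type:application/json' --data-binary '%s' %s/%s/schema\n" \
    "$(field_json "$1" "$2")" "$SOLR_URL" "$COLLECTION"
}

add_field_cmd ab_client_id string
add_field_cmd ab_org_type  string
add_field_cmd ab_address   text_general
add_field_cmd ab_modified  pdate
```

In SolrCloud the Schema API writes the change to the config stored in ZooKeeper, so it only needs to be sent to one node.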

Note that the schema is also where stop words and Soundex (phonetic) filters are configured (https://lucene.apache.org/solr/guide/7_2/filter-descriptions.html). It is easy to do, so I am not covering it here.

Create DIH (Data Import Handler):
It is responsible for connecting the database with Solr. Steps:
1)  Downconfig:
It is the opposite of upconfig. We now download the config that we uploaded to ZooKeeper; because we changed managed-schema through the admin panel, the changes are visible in the downloaded files. It is best practice to downconfig, make changes in any config file other than managed-schema, then upconfig and restart ZooKeeper and Solr; otherwise you will not see the changes. Create a destination directory (/tmp/conf) for the downloaded files and use the CLI:

/opt/solr/node1/solr/server/scripts/cloud-scripts/zkcli.sh -cmd downconfig -zkhost 127.0.0.1:2181 -confdir /tmp/conf/ -confname test-config
2)  Create DIH settings:
In /tmp/conf/solrconfig.xml
find the lib dir lines and paste the following after the last of them (needed for com.mysql.jdbc.Driver):

<lib dir="${solr.install.dir:../../../..}/lib/" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" />




             
Find the requestHandler section and paste the following after it.
<!-- add the request handler config file path -->
<requestHandler name="/dataimport"
        class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">hiring-list-config.xml</str>
  </lst>
</requestHandler>


Now create
/tmp/conf/hiring-list-config.xml and paste the following inside it:

<dataConfig>
    <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://172.16.16.15/checkpoint_live_developer?useSSL=false" user="###" password="####" batchSize="-1"/>
    <document>
        <entity name="ec_solr" query="select * from ec_solr" deltaImportQuery="select * from ec_solr where id='${dih.delta.id}'" deltaQuery="select id from ec_solr s where s.modified &gt; '${dih.last_index_time}'">
            <field column="id" name="id"/>
            <field column="client_id" name="ab_client_id"/>
            <field column="org_type" name="ab_org_type"/>
            <field column="address" name="ab_address"/>
            <field column="modified" name="ab_modified"/>
        </entity>
    </document>
</dataConfig>


3)  Install com.mysql.jdbc.Driver:
Extract the zip, find the jar file, and paste it into each node's lib directory (create the lib directory if it does not exist):
/opt/solr/node1/solr/lib/mysql-connector-java-5.1.45-bin.jar
/opt/solr/node2/solr/lib/mysql-connector-java-5.1.45-bin.jar
/opt/solr/node3/solr/lib/mysql-connector-java-5.1.45-bin.jar
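
Copying the jar to all three nodes can be done in one loop. This is a sketch: install_jar, SOLR_BASE and JAR are hypothetical helper names, and the default JAR path assumes the connector was extracted into the current directory.

```shell
#!/bin/sh
# Sketch: copy the MySQL connector jar into each node's lib directory.
SOLR_BASE="${SOLR_BASE:-/opt/solr}"

install_jar() {  # install_jar PATH_TO_JAR
  for i in 1 2 3; do
    # Create the lib directory if it does not exist yet
    mkdir -p "$SOLR_BASE/node$i/solr/lib" || return 1
    cp "$1" "$SOLR_BASE/node$i/solr/lib/" || return 1
  done
}

# JAR is an assumed location for the extracted download; adjust as needed
JAR="${JAR:-./mysql-connector-java-5.1.45-bin.jar}"
if [ -f "$JAR" ]; then
  install_jar "$JAR"
else
  echo "jar not found: $JAR" >&2
fi
```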

4)  Upconfig and restart Solr:

Now run upconfig, then restart ZooKeeper and Solr:
/opt/solr/node1/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -zkhost 127.0.0.1:2181 -confdir /tmp/conf/ -confname test-config

Must Restart:
/opt/solr/node1/zk/bin/zkServer.sh restart
/opt/solr/node2/zk/bin/zkServer.sh restart
/opt/solr/node3/zk/bin/zkServer.sh restart

/opt/solr/node1/solr/bin/solr restart -c -p 8983 -z localhost:2181
/opt/solr/node2/solr/bin/solr restart -c -p 8984 -z localhost:2182
/opt/solr/node3/solr/bin/solr restart -c -p 8985 -z localhost:2183

Now open localhost:8983 and check that Solr is running. If something went wrong, check the Solr log files in the admin panel. You can then import the data using the admin panel.
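
The import can also be triggered over HTTP through the /dataimport handler registered in solrconfig.xml above. The sketch below only builds the URLs; dih_url, SOLR_URL and COLLECTION are hypothetical helper names. full-import, delta-import and status are commands of the Data Import Handler.

```shell
#!/bin/sh
# Sketch: build Data Import Handler URLs for the test_me collection.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-test_me}"

# Prints the URL for one DIH command
dih_url() {  # dih_url COMMAND   (full-import, delta-import or status)
  echo "$SOLR_URL/$COLLECTION/dataimport?command=$1"
}

dih_url full-import   # load every row returned by the entity query
dih_url delta-import  # load only rows matched by deltaQuery
dih_url status        # check progress of a running import
```

To run an import, pass a URL to curl, for example curl "$(dih_url full-import)", then poll the status URL until the import is reported as finished.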








