Crawl content from a SOAP API using Sitecore Search

Sitecore Search can crawl content from many kinds of sources, such as sitemaps, REST APIs, RSS feeds, PDFs, and Word documents. We have a requirement to crawl data from a SOAP service API and provide the relevant content to the user.

Let's see how easily Sitecore Search helps us crawl content from a SOAP API.

For example, let's crawl the list of languages from the SOAP API below and store each language's details as a separate index document in Sitecore Search.

Below is the sample SOAP API endpoint used for the request and response in this walkthrough:
http://webservices.oorsprong.org/websamples.countryinfo/CountryInfoService.wso



Steps to configure the source crawler to crawl the content from the SOAP API

  • Create the Source with Web Crawler (Advanced). For APIs we would usually use the API Crawler, but it does not support SOAP APIs.


  • Update the necessary settings, such as Max Depth and Parallelism.
  • Click the Trigger tab and configure the trigger, which provides the starting URL for the crawl along with all the parameters required to get content from the SOAP API.
  • Set the Trigger Type to Request.
  • In the Body, add the necessary input details (the SOAP request envelope).
  • Add a header with the key Content-Type and the value text/xml; charset=utf-8.
  • Set the Method to POST.
  • Add the endpoint in the URL field.
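
The trigger values above can be summarised in a small sketch. This is only an illustration of what goes into each field, not Sitecore Search configuration syntax: buildSoapTrigger is a hypothetical helper, and the operation name ListOfLanguagesByName and its namespace are assumptions about the sample service that should be checked against its WSDL.

```javascript
// Hypothetical helper: mirrors the Trigger tab fields (URL, Method, Headers, Body).
// The operation name and namespace below are assumptions about the sample
// service; verify them against the service's WSDL before using them.
function buildSoapTrigger() {
  const body = [
    '<?xml version="1.0" encoding="utf-8"?>',
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">',
    '  <soap:Body>',
    '    <ListOfLanguagesByName xmlns="http://www.oorsprong.org/websamples.countryinfo" />',
    '  </soap:Body>',
    '</soap:Envelope>'
  ].join('\n');

  return {
    // Endpoint from this post, entered in the URL field.
    url: 'http://webservices.oorsprong.org/websamples.countryinfo/CountryInfoService.wso',
    method: 'POST',
    headers: { 'Content-Type': 'text/xml; charset=utf-8' },
    body: body
  };
}
```

The url, method, headers, and body values map one-to-one to the URL, Method, Headers, and Body fields of the Request trigger.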



Steps to configure the Document Extractor to extract the content from the SOAP API

  • Click the Document Extractor tab, provide the necessary Name, and set the Extractor Type to JS.
  • Click Add Tagger to extract the languages from the SOAP response; each language will be added as a separate index document.
  • Loop through the SOAP response and grab each language.
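
The looping logic can be sketched as below. This is a standalone illustration that works on the raw response string, not the exact extractor code: how the response is exposed inside the Sitecore Search JS extractor is not shown here, so the function would need adapting. The element names (tLanguage, sISOCode, sName) and the output field names (id, name, type) are assumptions; the field names must match the attributes defined for your entity.

```javascript
// Standalone sketch: turns a SOAP response string into one document per language.
// Element names (tLanguage, sISOCode, sName) and output fields (id, name, type)
// are assumptions, not confirmed against the live service or the entity schema.
function extractLanguages(xml) {
  const documents = [];
  // Match each tLanguage element, with or without a namespace prefix.
  const languagePattern = /<(?:\w+:)?tLanguage>([\s\S]*?)<\/(?:\w+:)?tLanguage>/g;
  let match;
  while ((match = languagePattern.exec(xml)) !== null) {
    const code = readElement(match[1], 'sISOCode');
    const name = readElement(match[1], 'sName');
    // Skip incomplete entries so every document has an id and a name.
    if (code && name) {
      documents.push({ id: code, name: name, type: 'language' });
    }
  }
  return documents;
}

// Returns the trimmed text of the first matching element in the fragment, or null.
function readElement(fragment, tag) {
  const pattern = new RegExp('<(?:\\w+:)?' + tag + '>([\\s\\S]*?)</(?:\\w+:)?' + tag + '>');
  const found = pattern.exec(fragment);
  return found ? found[1].trim() : null;
}

// Hand-written sample fragment, used only to exercise the function.
const sample =
  '<m:ListOfLanguagesByNameResult>' +
  '<m:tLanguage><m:sISOCode>abk</m:sISOCode><m:sName>Abkhazian</m:sName></m:tLanguage>' +
  '<m:tLanguage><m:sISOCode>aar</m:sISOCode><m:sName>Afar</m:sName></m:tLanguage>' +
  '</m:ListOfLanguagesByNameResult>';
const languageDocuments = extractLanguages(sample);
```

Each object in the returned array corresponds to one index document, which is what lets every language appear as a separate entry in the content collection.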

  • Publish the source once all the configuration is done.
  • Once indexing is complete, go to the Content Collection tab, click Content, and select the source named Soap (created above) to view the indexed documents.



Let's learn and grow together, happy programming 😊
