Challenges in getting a Sitecore item that has a hyphen or dash (-) in its name

Let's imagine a scenario where we want to extract the details of all items from Sitecore based on a list of page URLs provided in a CSV file.

The URL format in the CSV file:

https://{domainname}/{language}/{path-with-hyphen}

So we thought of using the PowerShell Get-Item command to get the details for every URL in the CSV file.

We expected everything to go smoothly, but when we looked at the results, most of the URLs had failed with an "item not found" exception in PowerShell. We suspected the hyphens in the URLs from the CSV were the cause: in the URL, every space in the item name is replaced with a hyphen because of the following settings.

<encodeNameReplacements>
   <replace mode="on" find="&amp;" replaceWith=",-a-," />
   <replace mode="on" find="?" replaceWith=",-q-," />
   <replace mode="on" find="/" replaceWith=",-s-," />
   <replace mode="on" find="*" replaceWith=",-w-," />
   <replace mode="on" find="." replaceWith=",-d-," />
   <replace mode="on" find=":" replaceWith=",-c-," />
   <replace mode="on" find=" " replaceWith="-" />  <!-- to replace space with dash -->
   <replace mode="on" find="_" replaceWith="-" />  <!-- to replace underscore with dash -->
</encodeNameReplacements>
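
To see why the URL alone cannot tell us the original item name, the sketch below imitates the last two replacements (space to dash, underscore to dash) with plain string operations. It is only an illustration and not Sitecore's own encoding code; the function name ConvertTo-EncodedName is a hypothetical helper introduced here.

```powershell
# Hypothetical helper: imitates only the space -> dash and underscore -> dash
# rules from <encodeNameReplacements>. This is not Sitecore's implementation.
function ConvertTo-EncodedName {
    param([string]$ItemName)
    return $ItemName.Replace(' ', '-').Replace('_', '-')
}

# Three different item names...
$names = @(
    'sewerage count-down',
    'sewerage count down',
    'sewerage_count-down'
)

# ...all encode to the same URL segment, so the mapping cannot be reversed.
$encoded = $names | ForEach-Object { ConvertTo-EncodedName $_ }
$encoded | Sort-Object -Unique
```

All three names collapse to the single value sewerage-count-down, which is why a hyphen in the URL may stand for a space, an underscore, or a real hyphen in the item name.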

So we used a PowerShell script to replace every hyphen (-) with a space and tried to get the items from Sitecore again. This time most of the item details were retrieved, but some items still failed with an "item not found" exception.
When we checked the failed items in Sitecore, we found that some item names in the CMS had themselves been created with hyphens in parts of the name.

Example:

The item was created in Sitecore with the name: Sultan-Al zzzzzz affirms sewerage count-down in Sharjah City
but the URL provided in the CSV is: Sultan-Al-zzzzzz-affirms-sewerage-count-down-in-Sharjah-City

This breaks the Get-Item PowerShell script, because we need an item name with hyphens and spaces in exactly the right places to get the item from Sitecore. The item name must exactly match the name it was created with in Sitecore.


     In code-behind, looking up the item with the URL from the CSV still returns null, but the item is returned when the name matches the one in Sitecore.

But we have a challenge: with nearly 5000k URLs, it is difficult to tell which item names were created with a hyphen and which were not. This leaves us unable to get the item details using PowerShell Get-Item alone.
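
To get a feel for why guessing is impractical: each hyphen in a URL segment could have been either a hyphen or a space in the original name (ignoring underscores for simplicity), so a segment with n hyphens has 2^n candidate names. A rough sketch of that count for the example above, using only plain PowerShell string operations:

```powershell
# URL segment taken from the example above
$segment = 'Sultan-Al-zzzzzz-affirms-sewerage-count-down-in-Sharjah-City'

# Splitting on '-' yields one more part than there are hyphens
$hyphenCount = ($segment -split '-').Count - 1

# Each hyphen is either a real hyphen or an encoded space: 2^n combinations
$candidateCount = [int][math]::Pow(2, $hyphenCount)

"{0} hyphens -> {1} candidate names" -f $hyphenCount, $candidateCount
```

For this one segment that is 9 hyphens and 512 possible names, so trying every combination with Get-Item for each URL does not scale.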

So we checked how the out-of-the-box search works in the Sitecore Content Editor's search box. We copied the item name, hyphens included, from a URL in the CSV file, and the Content Editor was able to find the item by that name, even though the item name in the CMS contains both spaces and hyphens.

This gave us the clue to use a PowerShell script with index-based filtering to find the item first, and then use information such as the ID and path to fetch more details with Get-Item.

Once we used index-based filtering, all the item details were retrieved without any issue.







Sample script to access the CSV file URLs and extract the item details
# Function to get the item from the index
function getSitecoreItem($itemUrl)
{
    # Get the item name from the last segment of the URL
    $uri = [System.Uri]::new($itemUrl)
    $lastPart = $uri.Segments[-1].TrimEnd('/')
    # Get the language code from the URL
    $regex = [regex]"/([a-z]{2}(?:-[A-Z]{2})?)/"
    # Do not assign to $matches: it is a PowerShell automatic variable
    $languageMatch = $regex.Match($itemUrl)
    if ($languageMatch.Success) {
        $languageCode = $languageMatch.Groups[1].Value
        # Write-Host, not Write-Output, so the message is not added to the function's return value
        Write-Host "Language Code: $languageCode"
    } else {
        Write-Host "Language Code not found"
    }
    # Define the item name you want to search for
    $itemName = $lastPart
    # Define the Sitecore index to query (master index in this example)
    $indexName = "sitecore_master_index"
    # Build the search query
    $searchString = "*$itemName*"
    # Execute the search query against the specified index
    $searchResults = Find-Item -Index $indexName -Criteria @{Filter = "Contains"; Field = "_name"; Value = $searchString}
    # Check if any results were found
    if ($null -ne $searchResults) {
        # Note: if several items match, ItemId holds an array of IDs
        $results = [PSCustomObject]@{
            "ItemId" = $searchResults.ItemId
            "LanguageCode" = $languageCode
        }
        return $results
    } else {
        # Log before returning; a statement after return would never run
        Write-Host "Item not found with name: $itemName"
        return $null
    }
}

# Start processing the CSV file to get the item details
# CSV file path
$csvPath = "D:\pageurls.csv"
$csvData = Import-Csv -Path $csvPath
$itemsNotAvailableArray = @()
$itemsArray = @()
foreach ($row in $csvData)
{
    # Get the URL from the CSV column named Path
    $itemPath = $row.Path
    # Look the item up in the index by its URL
    $itemfromindex = getSitecoreItem $itemPath
    $itemversions = $null
    if ($null -ne $itemfromindex)
    {
        Write-Host $itemfromindex.ItemId
        Write-Host $itemfromindex.LanguageCode
        # Fetch the full item by ID, in the language taken from the URL
        $itemversions = Get-Item -Path master: -ID $itemfromindex.ItemId -Language $itemfromindex.LanguageCode
    }
    if ($null -eq $itemversions) {
        # Record the URL so it can be checked manually later
        $itemsNotAvailableArray += [PSCustomObject]@{
            "ItemPath" = $row.Path
        }
        Write-Host $row.Path
    }
    else {
        foreach ($item in $itemversions) {
            # Create a custom object with the desired item fields
            $itemsArray += [PSCustomObject]@{
                "ItemPath" = $item.Paths.Path
                "ItemName" = $item.Name
                "Title" = $item["Title"]
                "Summary" = $item["Summary"]
                "ItemTemplateName" = $item.TemplateName
                "ItemLanguage" = $item.Language.Name
                "ItemVersion" = $item.Version.Number
            }
            Write-Host $item.Paths.Path
        }
    }
}
# Export the items found and the URLs that could not be resolved
$itemsArray | Export-Csv -Path "D:\latest_published_items.csv" -NoTypeInformation -Encoding UTF8
$itemsNotAvailableArray | Export-Csv -Path "D:\items_not_available.csv" -NoTypeInformation -Encoding UTF8
Write-Host "Export completed successfully."
Let's learn and grow together, happy programming 😊








