Get the start and end index of every word in a sentence


Hi, I am trying to generate two new columns that give the start and end index of every word in a sentence. There is only one sentence.
Ex:    This is an apple
       |  | |  |  |   |
index: 0  3 5  8  11  15
Word | Start | End
This | 0 | 3
is | 5 | 6
an | 8 | 9
apple | 11 | 15
Can anyone please help me here?
Answers
You could use the Split operator with a space (or a more complex regular expression for separating the words) to put each word into its own attribute.
Then use Loop Attributes to work on each attribute, determining its start and end position from the word's length. Inside the loop you could, for example, use macros to keep track of the current position.
Here is a solution:
This is a very limited solution. It expects the separator to be exactly one character. You should remove characters that you don't want to count (e.g. the dot at the end of the sentence) before putting data into this process.
You should be able to take it from here.
Regards,
Balázs
I don't understand your question. What kind of system is this, and what input does it expect?
You already have the solution for getting the start and end index of the words. Do you need to pass those?
Regards,
Balázs
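sentence = "This is an apple"  # example input; words separated by single spaces
start = 0
for word in sentence.split():
    end = start + len(word) - 1  # inclusive end index of the current word
    print(word, start, end)
    start = end + 2  # skip the one-character separator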
This will give you the 0-based, inclusive start and end indices for each word, provided the words are separated by exactly one space.
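The loop above relies on exactly one space between words, which is the same limitation mentioned earlier in the thread. If the sentence may contain repeated spaces or punctuation, one possible alternative is to let a regular expression report the positions directly. This is only a sketch: the function name word_spans is made up for illustration, and the pattern \w+ is an assumption about what counts as a word (letters, digits and underscores), so adjust it to your data.

```python
import re

def word_spans(sentence):
    """Return (word, start, end) tuples with 0-based, inclusive indices."""
    spans = []
    # \w+ matches runs of letters, digits and underscores, so spaces and
    # punctuation such as a trailing dot are skipped rather than counted.
    for match in re.finditer(r"\w+", sentence):
        # match.end() is exclusive, so subtract 1 for an inclusive end index.
        spans.append((match.group(), match.start(), match.end() - 1))
    return spans

# Two spaces after "This" and a trailing dot, to show that neither
# breaks the bookkeeping.
for word, start, end in word_spans("This  is an apple."):
    print(word, start, end)
```

Because the indices come from the match positions in the original string, nothing has to be removed from the sentence beforehand.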