Mitch38 ChatterBot

Joined: 12 Aug 2009 Posts: 38 Location: San Diego, CA. USA
Posted: Wed Oct 28, 2009 3:00 am Post subject: Input truncation
I have a recurring scenario where a user types a passage that is 30 or 40 words long. As we all know, unless you're in a tree, the engine will take forever to search rules if your database is large. I need a way of chopping (truncating) long inputs... My robot gives the right answer after five minutes of thinking, but that is NOT the real world. The delay outweighs the answer at that point. _________________ Don't worry, it will all be over soon...
Mitch38 ChatterBot

Joined: 12 Aug 2009 Posts: 38 Location: San Diego, CA. USA
Posted: Thu Oct 29, 2009 1:24 am
When I say "truncation", I mean a way of limiting the number of characters in the input string: for example, 20 words with an average of 4 characters each, for a character count of 80 (not counting spaces). I'm sure it's possible, but I don't know enough C code.
JonC MasterBot

Joined: 02 Apr 2008 Posts: 340 Location: Leicestershire, Great Britain
Posted: Thu Oct 29, 2009 9:59 pm
I think there are a few approaches to this.
1. You could use a C# snippet on the input (once captured) to truncate the string. Substring(0, n) is the syntax (I think); see the Microsoft C# documentation to check.
2. Write a bit of VCM code to do the same. This might be more compact overall, depending on how much C# you need to write to do the truncation within the rule(s) compared to making a VCM call in the rule(s).
A possible advantage of using the VCM code is that you could specify how many characters the input is truncated to each time or, with more complex VCM code, how many words. Again, this might have speed and/or accuracy advantages overall.
3. You could do a "keyword" search on the input (example code for this is in the "example kb" and its attached VCM; see downloads) to direct the Verbot to the correct response.
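For anyone who wants to see the truncation logic from options 1 and 2 spelled out, here is a minimal sketch. It is written in Python only to show the idea outside the Verbot engine; the function names are made up for illustration and are not part of Verbot or VCM. One thing worth knowing if you port it to C#: `Substring(0, n)` throws when `n` is larger than the string length, so a length check is needed first.

```python
def truncate_chars(text: str, max_chars: int) -> str:
    """Keep at most max_chars characters, dropping a trailing partial word."""
    if len(text) <= max_chars:
        return text  # short inputs pass through unchanged
    cut = text[:max_chars]
    # If the cut landed in the middle of a word, back up to the last space.
    if not text[max_chars].isspace() and " " in cut:
        cut = cut[:cut.rfind(" ")]
    return cut.rstrip()


def truncate_words(text: str, max_words: int) -> str:
    """Keep at most max_words whitespace-separated words."""
    return " ".join(text.split()[:max_words])
```

Cutting by words avoids sending half a word to the rule matcher; cutting by characters gives a hard upper bound on input size.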
Somniator MightyBot

Joined: 17 Jul 2006 Posts: 173
Posted: Tue Nov 03, 2009 2:09 pm
I created a rule like this in the "before anything else" knowledge base:
Code:
Rule Name: *longinput
Input Text: [blank](eos)[blank3](eos)[blank2](eos)[blank1]
Input Text: (oh) [blank](eos)[blank2](eos)[blank1]
Input Text: [blank](eos)[blank2](eos)[blank1]
Input Text: [blank](eos)[blank4](eos)[blank3](eos)[blank2](eos)[blank1]
Input Text: (oh) [blank](eos)[blank3](eos)[blank2](eos)[blank1]
Input Text: (oh) [blank](eos)[blank4](eos)[blank3](eos)[blank2](eos)[blank1]
Output Text: [blank4] ...?<send [blank4]>
Output Text: [blank4] ...?<send [blank4]>
Output Text: [blank1] ...?<send [blank1]>
Output Text: <send [blank1]>
Output Text: <send [blank1]>
Output Text: <send [blank1]>
Output Text: What? "[blank2] ...?"<send [blank2]>:-o
Output Text: Sorry? [blank2] ...?<send [blank2]>
(eos) is a synonym for sentence-ending punctuation marks.
This rule picks out one sentence of the input, processes it and forgets the rest. Usually the most important information in a long input is in the first or the last sentence, so those have priority.
It is not really satisfying, but it saves time. I am dreaming of a rule that recognizes relations between the sentences of the input. Today I'm getting outputs like "What do you mean by 'it'?" after an input like "Okay: We're going to Heathrow Airport. It is the closest airport from here."
And for even longer inputs I created a killer rule:
Code:
Rule Name: sermon
Input Text: [blank](eos)[blank5](eos)[blank4](eos)[blank3](eos)[blank2](eos)[blank1]
Input Text: (I said) [blank](eos)[blank5](eos)[blank4](eos)[blank3](eos)[blank2](eos)[blank1]
Output Text: Are we chatting or giving lectures? How about making it short?
Output Text: <agent.play idle1>Oh. What did you say? Please try it again and make it short, consumable for a poor Verbot.
Output Text: <agent.play idle1>Do you try to bore me with an endless sermon, or what?
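The same idea can be sketched outside the rule editor. This is only an illustration in Python, under a few assumptions that are not in the rules above: (eos) is taken to mean '.', '!' and '?', a run of punctuation counts as a single sentence break, the last sentence is the one kept, and six or more sentences count as a "sermon". The function names are hypothetical.

```python
import re
from typing import List, Optional

# Assumption: (eos) covers '.', '!' and '?'. A run such as "..." is one break.
EOS = re.compile(r"[.!?]+")


def split_sentences(text: str) -> List[str]:
    """Split on sentence-ending punctuation and drop empty pieces."""
    return [part.strip() for part in EOS.split(text) if part.strip()]


def pick_sentence(text: str, sermon_limit: int = 6) -> Optional[str]:
    """Return the last sentence, or None for empty input or a 'sermon'."""
    sentences = split_sentences(text)
    if not sentences or len(sentences) >= sermon_limit:
        return None  # caller answers with a "make it short" reply instead
    return sentences[-1]
```

With the Heathrow input, this keeps only "It is the closest airport from here", which shows the weakness described above: the sentence that "It" refers to is thrown away.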
leseur sylvain OmnipotentBot

Joined: 08 Nov 2004 Posts: 1361 Location: Suburb of Paris France
Posted: Tue Nov 03, 2009 4:16 pm
-Mitch38, about "truncation". Try this:
This flower is [wonderful]
vars["wonderful"].Length == 9
Or
vars["_input"].Length == 9
-Somniator, why do you need this "before anything else" file?
About "I am dreaming of a rule that recognizes relations between the sentences of the input."
In the Jeanneton.vkb file, I use
one var for [article]
one var for [object]
one var for [verb]
one var for [quality]
Then, later, others for why, how, when, etc.
Ex:
The______sky____is____blue
[article] [object] [verb] [quality]
I always use the same rules with numeric values:
- 1 to 100 concern humanity
- 101 to 200 concern animals
- 201 to 300 are natural objects
etc.
In my example,
[object] = 201
Then I trigger on that with a condition...
That doesn't mean that Jeanneton understands the topic, but it's a way to stay on subject.
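As a rough illustration of the numeric ranges, here is a small Python sketch. The three ranges and the value 201 for the sky example come from the description above; the lookup table, the labels as strings, and the function names are hypothetical and only show how a code could be mapped back to a topic.

```python
from typing import Optional

# Ranges as described in the post; labels are paraphrased.
CATEGORY_RANGES = [
    (1, 100, "humanity"),
    (101, 200, "animals"),
    (201, 300, "natural objects"),
]

# Hypothetical word-to-code table; only "sky" = 201 is taken from the example.
OBJECT_CODES = {"sky": 201}


def category_of(code: int) -> Optional[str]:
    """Return the category whose range contains code, or None."""
    for low, high, label in CATEGORY_RANGES:
        if low <= code <= high:
            return label
    return None


def topic_of(word: str) -> Optional[str]:
    """Look up the code for an [object] word and map it to a category."""
    code = OBJECT_CODES.get(word.lower())
    return category_of(code) if code is not None else None
```

A rule condition could then compare the stored code against a range to decide whether the conversation is still on the same subject.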
Thanks for your interesting rule.
Friendly regards.
I hope I'm not talking nonsense... _________________ Talk With Athena the GoddessBot
Jeanneton La Française, avec son petit panier sous son bras
Somniator MightyBot

Joined: 17 Jul 2006 Posts: 173
Posted: Wed Nov 04, 2009 12:32 am
@Sylvain:
I need a "before anything else" knowledge base for formal issues, e.g. filtering inputs: repeated inputs can be identified before the bot keeps giving the same answer, and long inputs can be processed and edited before the actual "input" is sent to the core KB.
Your Jeanneton concept is quite impressive, and it seems to dive deep into language analysis!
So, once the words have a structure, it should be possible to analyze them with some C# script... well, this sounds easier than it is. But I think it is the precondition for the smarter, more intelligent bots of the future!
Mitch38 ChatterBot

Joined: 12 Aug 2009 Posts: 38 Location: San Diego, CA. USA
Posted: Mon Nov 09, 2009 2:12 am
Somniator, your truncation rule works well as long as there is no ellipsis (...) in the input. Your rule counts punctuation, and two ellipses kill it...
Nice strategy though. I use an underlying knowledge base to catch junk like you do, and what we need is a 'letter counter'. Perhaps Sylvain's 'vars["blank"].Length == x' rule can be modified for use in a secondary KB like this:
Input: [blank]
Condition:vars["blank"].Length == >100
Output: I don't have time for these long-winded explanations. Give me the condensed version.
I know there is a way to use the 'greater than' (>) symbol somehow, but I'm just not that good of a programmer yet.
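For what it's worth, the test being asked for is a plain greater-than comparison, with no '==' in front of it. If the condition field accepts ordinary C# boolean expressions (an assumption, though the '== 9' example earlier in the thread suggests it does), it would presumably read `vars["blank"].Length > 100`. The logic, sketched in Python with a made-up function name:

```python
from typing import Optional

MAX_CHARS = 100  # threshold taken from the condition in the post


def respond(user_input: str, max_chars: int = MAX_CHARS) -> Optional[str]:
    """Return the brush-off reply for over-long input, else None."""
    if len(user_input) > max_chars:  # greater-than, not "== >"
        return ("I don't have time for these long-winded explanations. "
                "Give me the condensed version.")
    return None  # fall through to the normal rules
```

An input of exactly 100 characters still passes; use `>=` if the limit should be exclusive.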
Mitch38 ChatterBot

Joined: 12 Aug 2009 Posts: 38 Location: San Diego, CA. USA
Posted: Mon Nov 09, 2009 3:47 am
Sylvain, I was just going over your 'example' kb again, and you truly do think 'outside of the box'. You and/or JonC could actually write a book about this and get it published. Just an idea.
M
JonC MasterBot

Joined: 02 Apr 2008 Posts: 340 Location: Leicestershire, Great Britain
Posted: Wed Nov 11, 2009 9:11 pm
A book, Mitch?
HELL NO!
I am not that clever (maybe Sylvain is)!
Our joint ideas are made freely available to the forum and that's as far as I will go (apart from answering questions, that is).
Also, there are many far, far more clever C# and Verbot programmers than I, so I'd never have the hubris to do it.
Thanks for the compliment, but it is undeserved, for me at any rate.
Mitch38 ChatterBot

Joined: 12 Aug 2009 Posts: 38 Location: San Diego, CA. USA
Posted: Tue Nov 24, 2009 1:46 am Post subject: Truncation Method
I made something that works. It's not a character counter, but a simple word counter. So simple I almost kicked myself.
Rule Name: Word Counter
Input: * * * * * * * * * * * * * * * * * * * * * * * * *
Output: That's a lovely speech.
The asterisks count words; you can use as many as you want to tune it to your script. Like I said, really stupid, but it does work.
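The same check written as code, for comparison. This is a Python sketch with hypothetical names, and it assumes each asterisk has to match at least one word, so a rule with N asterisks fires on inputs of N words or more.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def is_speech(text: str, min_words: int) -> bool:
    """True when the input has at least min_words words.

    min_words plays the role of the number of asterisks in the rule.
    """
    return word_count(text) >= min_words
```

Tuning the threshold is then a matter of changing one number instead of adding or removing asterisks.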
Another trick I learned... Input filtering.
Rule name: Filter
Input:
*is that your opinion (prep) [blank]*
* fully aware of the fact that [blank]*
*you remember [blank]*
have you (senses)* (prep) [blank]
i thought we had * the fact (conj) [blank]*
Etc. (I built a library of about 50 of them.)
Output: <send [blank]>
What this does is chop off the unnecessary rambling and send on just the idea of the sentence.
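For anyone who would rather do this filtering in code, here is a rough Python sketch of the same trick using regular expressions. The three patterns are loosely modelled on the inputs listed above; the expansion of (prep) to "of", "on" or "about" is a guess, and the names are hypothetical.

```python
import re

# Hypothetical filler patterns; "rest" captures the part worth keeping.
FILLER_PATTERNS = [
    r".*\bfully aware of the fact that\s+(?P<rest>.+)",
    r".*\byou remember\s+(?P<rest>.+)",
    r".*\bis that your opinion (?:of|on|about)\s+(?P<rest>.+)",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in FILLER_PATTERNS]


def strip_filler(text: str) -> str:
    """Return the text after a known filler phrase, or the input unchanged."""
    cleaned = text.strip()
    for pattern in COMPILED:
        match = pattern.fullmatch(cleaned)
        if match:
            return match.group("rest").strip()
    return cleaned
```

The result would then be fed back into the matcher, which is what `<send [blank]>` does in the rule.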
Anyway, hope all is well with you guys.
JonC MasterBot

Joined: 02 Apr 2008 Posts: 340 Location: Leicestershire, Great Britain
Posted: Mon Mar 15, 2010 5:13 pm
See also the "Verbots and topics" thread.